## Abstract

Motivated by the cost savings that can be obtained by sharing resources in a network context, we consider a stylized, yet representative, model for the coordination of maintenance and service logistics for a geographic network of assets. Capital assets, such as wind turbines in a wind park, require maintenance throughout their long lifetimes. Two types of preventive maintenance are considered: planned maintenance at periodic, scheduled opportunities, and opportunistic maintenance at unscheduled opportunities. The latter type of maintenance arises due to the network context: When an asset in the network fails, this constitutes an opportunity for preventive maintenance for the other assets in the network. To increase the realism of the model and its applicability to various sectors, we consider both the option of deferring and the option of not deferring planned maintenance after the occurrence of opportunistic maintenance. We also assume that preventive maintenance may not always restore the condition of the system to ‘as good as new.’ By formulating this problem as a semi-Markov decision process, we characterize the optimal policy as a control limit policy (depending on the remaining time until the next planned maintenance) that indicates, on the one hand, when it is optimal to perform preventive maintenance and, on the other hand, when maintenance resources should be shared if an opportunity in the network arises. To facilitate managerial insights into the effect of each parameter on the cost, we provide a closed-form expression for the long-run rate of cost for any given control limit policy (depending on the remaining time until the next planned maintenance) and compare the costs under the optimal policy to those of suboptimal policies that neglect the opportunity for resource sharing. We illustrate our findings using data from the wind energy industry.

## 1 Introduction

High-value capital assets, such as energy systems (for example, wind turbines), medical systems (for example, interventional X-ray machines), lithography machines in semiconductor fabrication plants, and baggage handling systems at airports, require maintenance throughout their (long) lifetimes. Such capital assets are crucial to the primary processes of their users/operators, and unexpected failures may have very significant negative impacts and even life-threatening consequences. To avoid or minimize failures, asset owners perform preventive maintenance activities, with the objective of retaining a system in, or restoring it to, a satisfactory operating condition. The costs of these maintenance activities and of the associated unscheduled downtimes represent one of the key drivers of an organization’s total costs. Such maintenance costs can constitute up to 70% of the total value of the end product [4, 22], and this percentage is rapidly increasing [44]. Hence, there is a great incentive for asset owners to optimize their maintenance planning.

The most common maintenance practices are *corrective maintenance* and *planned maintenance*. The former, as the name suggests, consists of repairing the asset upon failure, while the latter follows a fixed service schedule for the field service engineers, with the objective of ensuring that the asset operates correctly and of avoiding unscheduled breakdowns and downtime. The cost of planned maintenance is relatively low in comparison to that of corrective maintenance, due to its planned, anticipated nature. Planned maintenance is characterized by scheduled downtimes (in contrast to the unscheduled downtime experienced at a failure, which leads to corrective maintenance) that occur at fixed intervals, say at instances \(\tau ,2\tau ,3\tau ,\ldots \) (for example, \(\tau = 6\) months). Such instances constitute the *scheduled opportunities* of preventive maintenance.

In the context of a network of assets, such as a wind park or a network of hospitals in close geographic proximity (from the viewpoint of the service provider), there is a second type (in addition to the above scheduled instances) of opportunity to perform preventive maintenance. In the event that a failure occurs, its corrective maintenance instance can be viewed as an unscheduled opportunity for preventive maintenance for the other assets in the network. In these instances, *opportunistic maintenance* can take place, with the respective instances constituting the *unscheduled opportunities* of preventive maintenance. This form of network dependency can be viewed on two levels: (i) the economic dependency between the various systems of a network, and (ii) the structural degradation and failure dependencies. Similarly to planned maintenance, opportunistic maintenance has a lower cost in comparison to that of corrective maintenance.

Incorporating opportunistic maintenance may also affect the scheduling of planned maintenance, as it might be beneficial to defer the next planned maintenance opportunity so that it takes place a period of length \(\tau \) after the occurrence of an opportunistic maintenance. This decision of *deferring or not the scheduling of planned maintenance* after the occurrence of opportunistic maintenance may have a positive or negative effect on the total costs.

In maintenance, it is often assumed that a maintenance activity is perfect, i.e., that it restores the system to a state of ‘as good as new.’ However, this assumption may not hold in practice. For instance, a misidentification of the root cause of the (imminent) failure can lead to an erroneous repair that does not resolve the actual issue, and a minor repair activity (such as an exchange of parts, a change or adjustment of the settings, a software update, lubrication or cleaning, etc.; see [34]) may not restore the system to a state of ‘as good as new.’ In such cases, it is more reasonable to assume that the system is restored to a state between ‘as bad as old’ and ‘as good as new.’ This concept will be referred to as *imperfect maintenance*. Evidently, this assumption impacts the resulting cost. Hence, knowledge of how successful a maintenance activity is should not be ignored in the maintenance planning.

In conclusion, asset owners are oftentimes faced with the following questions:

- (i) What is the advantage of incorporating planned maintenance in comparison to exercising only corrective maintenance?
- (ii) What is the benefit of sharing resources in the network (in the form of incorporating opportunistic maintenance in addition to the planned maintenance)?
- (iii) What is the influence of deferring the planned maintenance after the occurrence of opportunistic maintenance?
- (iv) What is the influence of imperfect maintenance on the maintenance planning and on the costs (long-run rate of cost)?
- (v) When should preventive maintenance be performed (so as to minimize the long-run rate of cost)?

### 1.1 Main contributions

We consider a stylized, yet representative, model that incorporates the above-mentioned characteristics, and we prove the existence of the optimal maintenance policy and we derive its structure. Furthermore, we compute an explicit expression for the long-run rate of cost, which can be easily used by asset owners and service providers so as to gain further insights into their practice and so as to compute the cost-benefits of changing their maintenance practice. More concretely, the main contributions of the paper are threefold: (1) We consider a semi-Markov decision process that incorporates planned and opportunistic maintenance, as well as imperfect maintenance. From the analysis of the semi-Markov decision process stems the characterization of the optimal policy as a control limit policy (threshold) depending on the time until the next planned maintenance opportunity. Moreover, using this approach, we are able to derive a closed-form expression for this control limit. (2) Considering the class of control limit policies (depending on the remaining time until the next planned maintenance), we derive, using the theory of regenerative processes, an explicit expression for the long-run rate of cost. (3) We consider data from the wind energy industry and provide, based on these values, concrete answers to Questions (i)–(v) mentioned above. More specifically, we analyze the benefit of using planned and opportunistic maintenance compared to only corrective maintenance. We also analyze the influence of deferring planned maintenance after the occurrence of opportunistic maintenance. Finally, we also highlight the cost savings that can be attained by reducing the probability of an imperfect maintenance.

### 1.2 Outline of this paper

The remainder of this paper is structured as follows: In Sect. 2, we review the related literature. In Sect. 3, we describe in detail the model at hand, which captures the condition of the asset and which incorporates imperfect maintenance at scheduled and unscheduled maintenance opportunities. Subsequently, in Sect. 4, we characterize the structure of the optimal policy for condition-based maintenance using the average cost criterion, see Sect. 4.1, and we compute the long-run rate of cost for any policy with the same structure as the optimal policy (i.e., the class of control limit policies depending on the remaining time until the next planned maintenance), see Sect. 4.2. In Sect. 5, we permit the deferral of planned maintenance after the occurrence of opportunistic maintenance, and we compute the long-run rate of cost. A numerical illustration is provided in Sect. 6, where, based on data from the wind energy industry, we compare the long-run rate of cost for various policies, we show the effect of imperfect maintenance, and the effect of deferring planned maintenance. Finally, Sect. 7 contains concluding remarks and highlights directions for future research.

## 2 Literature review

Maintenance optimization models have been extensively studied in the literature. Optimal maintenance policies aim to provide optimal system reliability/availability and safety performance at lowest possible maintenance costs [27]. Due to the fast development of sensing techniques in recent years, the state of a capital asset can be monitored or inspected at a much lower cost and in a continuous fashion, which facilitates condition-based maintenance. Condition-based maintenance recommends maintenance actions based on information collected through online monitoring of the capital asset and it can significantly reduce maintenance costs by decreasing the number of unnecessary maintenance operations; see, for example, Jardine et al. [10], Peng et al. [26] and Lam and Banjevic [18]. The condition-based maintenance model that we propose builds on the delay time model proposed by Christer [6] and Christer and Waller [5]. We refer the reader to Baker and Christer [2], Christer [7] and Wang [38], and, more recently, Wang [39] for an overview on delay time models. Not only are delay time models well-known in the literature, but they also very frequently appear in practice.

Practice-based research with real diagnostic data, such as data related to the spectrometry of oil (for example, [16, 21]) and data related to vibrations (for example, [40]), showed that it is usually sufficient, and even preferable from a modeling and decision-making perspective, to consider only two operational states. The first state is the perfect state, in which the system resides from the moment it is newly installed until a hidden defect is identified. From the occurrence of a hidden defect until the occurrence of a failure (a period typically referred to as the delay time), the system resides in the second state, also referred to as the satisfactory state. Such a classification of the operational states has the property that maintenance actions are initiated only when the system has degraded to the state that can actually lead to a direct failure, i.e., the satisfactory state, but not when the system is functioning perfectly, i.e., the perfect state. The vast majority of the literature on delay time models is restricted to numerical methods or approximations to solve the models at hand, due to their underlying complexity. A few recent exceptions are Maillart and Pollock [20], Kim and Makis [17] and Van Oosterom et al. [36], who study two-state systems under periodic inspection, partial observability, and postponed replacement, respectively, and provide analytical results regarding the structure of the optimal policy. However, none of them considers the option of resource sharing in the network (in the form of opportunistic maintenance), nor do they incorporate the notion of imperfect repair.

Most delay time model analyses assume that the system after a maintenance action is restored to a state of ‘as good as new.’ Contrary to this assumption, in imperfect maintenance it is assumed that, upon preventive maintenance, the system lies in a state somewhere between ‘as good as new’ and ‘as bad as old.’ This notion was first introduced by Nakagawa [23, 24] and is called the (*p*, *q*)-rule. Under the (*p*, *q*)-rule, after preventive maintenance the system is returned to an ‘as good as new’ state (perfect preventive maintenance) with probability *p* and to the ‘as bad as old’ state (minimal preventive maintenance) with probability \(q = 1 - p\). Clearly, the case \(p = 0\) corresponds to having no preventive maintenance. Also, from a practical point of view, imperfect maintenance can describe a large set of realistic maintenance actions [27].

When planning condition-based maintenance strategies, see, for example, Jardine et al. [10], Jardine and Tsang [11] and Prajapati et al. [28], a typical assumption in the literature is that the system at hand is monitored continuously and one can intervene and maintain the system at any given moment. However, due to accessibility reasons (for example, in the case of off-shore wind parks) or for cost reduction purposes, it is cost optimal and more practical to allow only for discrete time opportunities. The simplest among the discrete time opportunities are the periodic planned maintenance instances (also referred to as scheduled downs), with period, say, \(\tau \), that serve as a scheduled opportunity to do maintenance for a network of systems. Furthermore, unplanned maintenance instances (due to opportunistic maintenance) can be modeled as discrete instances occurring according to a multi-dimensional counting process.

For recent works related to opportunistic maintenance, the interested reader is referred to Zhu et al. [42, 43], Arts and Basten [1] and Kalosi et al. [14]. In Zhu et al. [43] and Zhu et al. [42], the authors consider a single-unit system and account for both scheduled and unscheduled opportunities. In these analyses, the authors model the age and the condition, respectively, of the system and derive, based on approximations, the long-run rate of cost under a given policy. In both papers, the arrivals of unscheduled opportunities are modeled according to a homogeneous Poisson process. This approximation is justified by the Palm–Khintchine theorem [15], which states that even if the failure times of some systems do not follow exponential distributions, the superposition of a sufficiently large number of independent renewal processes behaves asymptotically like a Poisson process. Arts and Basten [1] build further on Zhu et al. [42, 43], but they only consider scheduled maintenance opportunities (excluding unscheduled opportunities). Furthermore, Arts and Basten [1] assume that at a scheduled opportunity, the system is restored to a perfect condition (i.e., \(p=1\)), while at a failure they assume that the system is restored to a state which is stochastically identical to the state just prior to the system’s failure. In a recent conference paper, Kalosi et al. [14] looked at a model with both planned and unplanned maintenance opportunities, at which the system is restored to a perfect condition, showing some preliminary results that a control limit policy (depending on the remaining time until the next planned maintenance) is optimal.

In contrast to Arts and Basten [1] and to Zhu et al. [42, 43], in which the long-run rate of cost is computed for a given policy, we first characterize the structure of the optimal policy explicitly and thereafter, for the optimal policy class, we compute the long-run rate of cost. Furthermore, we include both scheduled and unscheduled maintenance opportunities. In contrast to Kalosi et al. [14], we extend the model by incorporating the (*p*, *q*)-rule, making it more generic and realistic. Moreover, we are the first to analyze the influence of deferring planned maintenance and we illustrate the financial effects of the maintenance policy in a realistic context using data stemming from the wind industry.

## 3 Model description

We consider a single-unit system (equivalently, a component or asset) that is monitored continuously and whose condition is fully observable. We assume that the condition of the system degrades over time and that it can be modeled according to a delay time model. That is, the states are classified as *perfect*, *satisfactory* and *failed*. We shall refer to the state of perfect condition as state 2, the state of satisfactory condition as state 1 and the failure state as state 0. Furthermore, we assume that as soon as a system failure occurs, the system is instantaneously replaced by an ‘as good as new’ system. So, in the mathematical formulation of the model, we may assume, due to the instantaneous replacement at failure, that the model evolves between only states 1 and 2. The system spends an exponential amount of time with rate \(\mu _i\) in state *i*, \(i\in \{1,2\}\). The above model formulation implies that initially the system starts in state 2 (perfect state), then after an exponential amount of time with rate \(\mu _2\), the system deteriorates and the condition of the system goes to state 1 (satisfactory state). The system spends an exponential amount of time with rate \(\mu _1\) in state 1, after which a failure occurs. At a failure, the system is instantaneously replaced by an ‘as good as new’ system and the condition is restored to 2 (perfect state). A schematic evolution of the condition of the component and the corresponding times of transitions is depicted in Fig. 1.
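As an illustration of the degradation dynamics just described, the following minimal sketch simulates the delay time model when only corrective replacements take place. It assumes exponential sojourn times with rates \(\mu _2\) (perfect state) and \(\mu _1\) (satisfactory state), as in the text; the function names and the default simulation settings are hypothetical and chosen only for this example.

```python
import random


def corrective_only_failure_rate(mu1, mu2):
    """Long-run number of failures per time unit without preventive maintenance.

    One cycle consists of a sojourn in state 2 (mean 1/mu2) followed by a
    sojourn in state 1 (mean 1/mu1) and ends with a failure.
    """
    return 1.0 / (1.0 / mu2 + 1.0 / mu1)


def simulate_failure_rate(mu1, mu2, n_cycles=20000, seed=0):
    """Monte Carlo estimate of the same quantity (illustrative settings)."""
    rng = random.Random(seed)
    total_time = 0.0
    for _ in range(n_cycles):
        total_time += rng.expovariate(mu2)  # time in state 2 (perfect)
        total_time += rng.expovariate(mu1)  # delay time in state 1 (satisfactory)
    return n_cycles / total_time
```

For instance, with \(\mu _1 = 2\) and \(\mu _2 = 0.5\), the mean cycle length is \(2 + 0.5 = 2.5\), so that failures occur at a long-run rate of \(0.4\) per time unit.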

We assume that we have two types of opportunities at which we can perform preventive maintenance (PM) before failure: the scheduled and the unscheduled opportunities. The scheduled opportunities correspond to pre-arranged opportunities occurring according to a fixed schedule. These opportunities can be attributed to either service/maintenance agreements or to regulation imposition checks. We assume that the scheduled opportunities occur at epochs \(\tau ,2\tau ,3\tau ,\ldots \), with \(\tau >0\). This is also in accordance with what happens in practice as maintenance actions, once planned, are typically not rescheduled. The unscheduled opportunities correspond to random opportunities triggered by failures of other systems in close proximity. We assume that these unscheduled opportunities occur according to a Poisson process at rate \(\lambda \).
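The two opportunity streams can be sketched as follows. The helper names are hypothetical; the only modeling assumptions used are those stated above, namely SOs at multiples of \(\tau \) and USOs generated by a Poisson process with rate \(\lambda \) (sampled here through exponential inter-arrival times).

```python
import random


def remaining_time_to_next_so(s, tau):
    """Time from epoch s until the next scheduled opportunity.

    Scheduled opportunities occur at tau, 2*tau, ...; the value is 0 when s
    coincides with a scheduled opportunity.
    """
    return (-s) % tau


def sample_uso_epochs(lam, horizon, seed=0):
    """Sample the epochs of a Poisson process with rate lam on [0, horizon)."""
    rng = random.Random(seed)
    epochs = []
    s = rng.expovariate(lam)
    while s < horizon:
        epochs.append(s)
        s += rng.expovariate(lam)
    return epochs
```

Each sampled USO epoch can then be paired with its remaining time until the next SO, which is the quantity on which the decision at a USO may depend.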

The unscheduled and scheduled opportunities, abbreviated by USO and SO, respectively, serve as opportunities to perform preventive maintenance. Such preventive maintenance is assumed to cost less than corrective maintenance (CM) upon failure, which costs \(c_{\text {cm}}\). Moreover, from a planning perspective, we may assume that the preventive maintenance cost at an SO, \(c_{\text {pm}}^{\text {so}}\), is less than or equal to the corresponding cost at a USO, \(c_{\text {pm}}^{\text {uso}}\), that is, \(0<c_{\text {pm}}^{\text {so}}\le c_{\text {pm}}^{\text {uso}}<c_{\text {cm}}\) (however, we also extend our analysis to the case \(c_{\text {pm}}^{\text {so}}> c_{\text {pm}}^{\text {uso}}\)). Following the (*p*, *q*)-rule of Nakagawa [23, 24], we assume that after preventive maintenance a system is returned to the ‘as good as new’ state with probability \(p\in (0,1]\) and returned to the ‘as bad as old’ state (i.e., the amount of time left until the failure is unaltered) with probability \(q=1-p\).

Our aim is to determine a policy for when to perform preventive maintenance on the system based on its condition and the opportunity type, i.e., scheduled or unscheduled. More explicitly, we will need to formally define the state space, which refers to the condition of the system, the action space and the decision epochs. The state space is governed by the process depicting the condition of the system, i.e., the Markov chain evolving between the states \(\{1,2\}\). The action space consists of only two actions: perform preventive maintenance or do nothing. Lastly, the decision epochs are the SO and USO epochs. In Fig. 2, we depict the SO epochs by (\(*\)) and the USO epochs by (o).

Table 1 summarizes the abbreviations that we will use throughout the remainder of this paper.

## 4 Optimal policy

The goal of this section is twofold: We first characterize the structure of the optimal average cost condition-based maintenance policy. We then derive an explicit form for the long-run rate of cost per time unit for any given policy that has the same structure as the optimal policy.

### 4.1 Average cost criterion

This section is devoted to the derivation of the optimal policy for when to perform preventive maintenance for the system at hand using the average cost criterion. To this purpose, we set up our problem as a (controlled) semi-Markov decision process. Due to the stochastic nature of the problem, it does not suffice to know the type of the decision epoch (SO or USO); it is also necessary to keep track of the remaining time until the next SO. That time may impact our decision, i.e., the optimal policy may depend on the residual time until the next SO. Thus, for the full description of the condition (state) of the system, we use a triplet descriptor

\[
(i,j,t)\in {\mathcal {S}},
\]

where *i* indicates the condition of the system and *j* indicates the type of epoch. If \(j=\text {SC}\), then the condition of the system is about to change and there is no decision associated with this epoch, while if \(j=\text {SO}\) or \(j=\text {USO}\), this is a decision moment at either a scheduled (SO) or an unscheduled opportunity (USO), respectively. Finally, the third element, *t*, indicates the remaining time until the next SO. Note that if \(j=\text {SO}\) then \(t=0\). The introduction of the remaining time until the upcoming SO in the full description of the condition of the system renders the model inhomogeneous, and for this reason we use techniques that stem from semi-Markov decision processes. Note here that the inclusion of the remaining time until the upcoming SO in the state, although it complicates the analysis, permits us to prove that there is an optimal policy in the class of deterministic stationary policies, cf. Propositions 1 and 3. At each decision epoch (depending on the values of \((i,j,t)\in {\mathcal {S}}\)), we can choose to perform preventive maintenance or do nothing, or, in case of a failure, to do corrective maintenance (CM), that is, \({\mathcal {A}} =\{\text {perform PM, do nothing, perform CM}\}\), where \({\mathcal {A}}\) represents the overall action space.

### Proposition 1

For the model at hand, there exists a deterministic stationary policy that is optimal for the average cost criterion.

A formal version of the above proposition, cf. Proposition 3, and its proof can be found in Appendix A, together with a full formal definition of the model in the context of semi-Markov decision processes. In addition to the theoretical validation that the above proposition offers on the existence and nature of the optimal maintenance policy, in the following theorem we compute the optimal policy.

### Theorem 1

Under the assumption that \(c_{\text {pm}}^{\text {so}}< c_{\text {pm}}^{\text {uso}}\) and given the imperfect preventive maintenance probability \(1-p\in [0,1)\), the optimal policy under the average cost criterion is: For state 2, do nothing. For state 1, perform preventive maintenance at scheduled opportunities if \( \mu _1 c_{\text {cm}} > (\mu _1+\mu _2)\frac{c_{\text {pm}}^{\text {so}}}{p}\), and do nothing otherwise. Moreover, for state 1, perform preventive maintenance at unscheduled opportunities for which the residual time until the next scheduled opportunity is in \([{\hat{t}},\tau )\), if \(\mu _1 c_{\text {cm}} > \left( \frac{c_{\text {pm}}^{\text {uso}}}{p} - \frac{c_{\text {pm}}^{\text {uso}} - c_{\text {pm}}^{\text {so}}}{e^{(\mu _1 +\mu _2)\tau }-1} \right) (\mu _1+\mu _2)\), and do nothing otherwise. Here, \({\hat{t}} = \min \{\tau ,\max \{0,t^*\}\}\), with \(t^*\) satisfying

### Proof

See Appendices B and C. \(\square \)

For USOs, Theorem 1 establishes a control limit policy depending on the remaining time until the next SO: If the residual time until the next SO is smaller than \({\hat{t}}\), then it is optimal to not take the opportunity to perform preventive maintenance in state 1. This is intuitive in the sense that the urgency for preventive maintenance in state 1 at a USO should decrease as the cheaper opportunity at an SO is approaching.
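The two inequalities of Theorem 1 and the resulting control limit rule can be evaluated as in the sketch below. The function names are hypothetical, and the control limit \({\hat{t}}\) is treated as an input supplied by the caller, since it is defined through \(t^*\); the code does not attempt to compute \(t^*\).

```python
import math


def theorem1_conditions(mu1, mu2, p, tau, c_so, c_uso, c_cm):
    """Evaluate the two inequalities of Theorem 1 (case c_so < c_uso).

    Returns (pm_at_so, pm_at_uso): whether preventive maintenance in state 1
    is worthwhile at SOs, and at USOs within the window [t_hat, tau).
    """
    rate = mu1 + mu2
    pm_at_so = mu1 * c_cm > rate * c_so / p
    # math.expm1(x) computes exp(x) - 1 accurately for small x.
    uso_bound = (c_uso / p - (c_uso - c_so) / math.expm1(rate * tau)) * rate
    pm_at_uso = mu1 * c_cm > uso_bound
    return pm_at_so, pm_at_uso


def maintain_at_uso(state, remaining_time, t_hat, tau, uso_condition):
    """Control limit rule at a USO; t_hat is assumed to be given."""
    return state == 1 and uso_condition and t_hat <= remaining_time < tau
```

With \(\mu _1=2\), \(\mu _2=0.5\), \(p=0.8\), \(\tau =1\), \(c_{\text {pm}}^{\text {so}}=1\) and \(c_{\text {pm}}^{\text {uso}}=2\), the SO inequality reads \(2c_{\text {cm}}>3.125\) and the USO inequality reads approximately \(2c_{\text {cm}}>6.03\), so that for \(c_{\text {cm}}=2\) preventive maintenance is worthwhile at SOs only.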

Note that in the special case when preventive maintenance costs at SOs and USOs are equal, the optimal policy reduces to a stationary control limit policy, which is shown in Proposition 2.

### Proposition 2

Under the assumption that \(c_{\text {pm}}^{\text {so}}=c_{\text {pm}}^{\text {uso}}=c_{\text {pm}}>0\) and given the imperfect preventive maintenance probability \(1-p\in [0,1)\), the optimal policy under the average cost criterion is: For state 2, do nothing. For state 1, perform preventive maintenance at both SOs and USOs if \( \mu _1 c_{\text {cm}}> (\mu _1+\mu _2) \frac{c_{\text {pm}}}{p}\), and do nothing otherwise.

### Proof

The proof of this proposition is identical in structure to the proof of Case (i) in the proof of Theorem 1, and for this reason it is omitted. \(\square \)

One could also argue that the cost for preventive maintenance at a USO is actually less than the cost at an SO since there is already a cost attached to the opportunity at hand (for example, service engineers are already at a wind park and they can, at a small extra cost, repair other systems in close proximity as well). In this case, the optimal control policy also reduces to a stationary control limit policy, which is described in Theorem 2.

### Theorem 2

Under the assumption that \(c_{\text {pm}}^{\text {so}}>c_{\text {pm}}^{\text {uso}}\) and given the imperfect preventive maintenance probability \(1-p\in [0,1)\), the optimal policy under the average cost criterion is: For state 2, do nothing. For state 1, perform preventive maintenance at an unscheduled opportunity if \( \mu _1 c_{\text {cm}} > (\mu _1+\mu _2)\frac{ c_{\text {pm}}^{\text {uso}}}{p} \), and do nothing otherwise. Moreover, for state 1, perform preventive maintenance at an SO if \(\mu _1 c_{\text {cm}} > (\mu _1+\mu _2 )\frac{c_{\text {pm}}^{\text {so}}}{p}+ \lambda ({c_{\text {pm}}^{\text {so}}}-c_{\text {pm}}^{\text {uso}})\), and do nothing otherwise.

### Proof

See Appendix D. \(\square \)
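The stationary rule of Theorem 2 amounts to two inequality checks, sketched below with a hypothetical function name. When the two preventive maintenance costs coincide, the term \(\lambda (c_{\text {pm}}^{\text {so}}-c_{\text {pm}}^{\text {uso}})\) vanishes and both checks reduce to the single condition of Proposition 2.

```python
def theorem2_policy(mu1, mu2, lam, p, c_so, c_uso, c_cm):
    """Evaluate the conditions of Theorem 2 (case c_so > c_uso) for state 1.

    Returns a dict stating whether preventive maintenance is performed at
    unscheduled and at scheduled opportunities.
    """
    rate = mu1 + mu2
    pm_at_uso = mu1 * c_cm > rate * c_uso / p
    pm_at_so = mu1 * c_cm > rate * c_so / p + lam * (c_so - c_uso)
    return {"pm_at_uso": pm_at_uso, "pm_at_so": pm_at_so}
```

For example, with \(\mu _1=2\), \(\mu _2=0.5\), \(\lambda =1\), \(p=0.8\), \(c_{\text {pm}}^{\text {so}}=3\), \(c_{\text {pm}}^{\text {uso}}=1\) and \(c_{\text {cm}}=2\), the left-hand side equals 4, the USO bound equals 3.125 and the SO bound equals 11.375, so that preventive maintenance is performed at USOs but not at SOs.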

### 4.2 Long-run rate of cost per time unit

In the previous section, we characterized the structure of the optimal policy using the average cost criterion. This policy can be viewed as a control limit policy, with the control limit depending on the time until the next SO. In this section, we consider such a policy and we compute the long-run rate of cost per time unit. More concretely, we consider a policy under which in state 2 we do not perform preventive maintenance (i.e., we do nothing), and in state 1 we always perform preventive maintenance at SOs and we perform preventive maintenance at USOs if the remaining time until the next SO is greater than \({\tilde{t}}\), for some given value \({\tilde{t}}\in (0,\tau )\). The results obtained in this section are directly applicable to the results of Sect. 4.1, by setting \({\tilde{t}}={\hat{t}}\) (which equals \(t^*\) whenever \(t^*\in (0,\tau )\)), cf. Theorem 1.

For the computation of the long-run rate of cost per time unit, we employ the theory of regenerative-like processes, also called stationary-cycle processes, described in Section 2.19 of Serfozo [33]. For this purpose, we consider the inter-regeneration times created by the SOs \(\{\tau , 2\tau , 3\tau , \ldots \}\). For the cost computation, we assume that, at the SOs, the system is in state 1 or 2 with stationary probabilities \(p_1(0)\) and \(p_2(0)\), respectively. The long-run rate of cost per time unit is calculated as the expected total cost incurred between consecutive SOs divided by \(\tau \).

Let \(p_i(t)\) be the probability that the system is in state \(i \in \{1,2\}\) given that the time until the next SO is \(t\in [0,\tau )\). Then the long-run rate of cost per time unit for this control limit policy (depending on the remaining time until the next planned maintenance) for any given time threshold is given in the next theorem.

### Theorem 3

Consider a given policy under which in state 2 we opt to do nothing, and in state 1 we repair at scheduled opportunities and at unscheduled opportunities for which the remaining time until the next scheduled opportunity is greater than \({\tilde{t}}\in (0,\tau )\), and we do nothing otherwise. Under this policy, the long-run rate of cost per time unit equals

with

where the constants \(C_1\) and \(C_2\) are obtained as follows:

### Proof

The expected total cost incurred in one cycle consists of three parts (cf. Eq. (2)), which are related to the expected cost associated with preventive maintenance at SOs, with preventive maintenance at USOs and with corrective maintenance, respectively. It is now sufficient to derive \(p_i(t)\) for \(t\in [0,\tau )\), \(i \in \{1,2\}\).

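For \(t\in [{\tilde{t}},\tau )\), the time-dependent behavior of \(p_1(t)\) is governed by

\[
p_1(t) = \mu _2\,\mathrm{d}t\; p_2(t+\mathrm{d}t) + \bigl(1-(\mu _1+\lambda p)\,\mathrm{d}t\bigr)\, p_1(t+\mathrm{d}t). \tag{5}
\]

Equation (5) is obtained by considering a small time interval of length \(\mathrm{d}t\) and recalling that \(t\) denotes the remaining time until the next SO, so that the epoch with remaining time \(t+\mathrm{d}t\) precedes the epoch with remaining time \(t\). At time *t* we are in state 1 either due to a transition from state 2, with infinitesimal probability \(\mu _2\,\mathrm{d}t\), or because we have remained in state 1, with infinitesimal probability \(1-(\mu _1+\lambda p)\,\mathrm{d}t\). Subtracting \(p_1(t+\mathrm{d}t)\) from both sides of Eq. (5), some straightforward computations yield

\[
p_1(t)-p_1(t+\mathrm{d}t) = \mu _2\, p_2(t+\mathrm{d}t)\,\mathrm{d}t - (\mu _1+\lambda p)\, p_1(t+\mathrm{d}t)\,\mathrm{d}t.
\]

Dividing this expression by \(\mathrm{d}t\) and letting \(\mathrm{d}t\rightarrow 0\) results in

\[
-\frac{\mathrm{d}}{\mathrm{d}t}p_1(t) = \mu _2\, p_2(t) - (\mu _1+\lambda p)\, p_1(t).
\]

Following a similar analysis for \(p_2(t)\) yields the following system of differential equations, for \(t\in [{\tilde{t}},\tau )\):

\[
\begin{aligned}
-\frac{\mathrm{d}}{\mathrm{d}t}p_1(t) &= -(\mu _1+\lambda p)\, p_1(t) + \mu _2\, p_2(t),\\
-\frac{\mathrm{d}}{\mathrm{d}t}p_2(t) &= (\mu _1+\lambda p)\, p_1(t) - \mu _2\, p_2(t).
\end{aligned} \tag{6}
\]

Similarly, for \(t\in [0,{\tilde{t}})\) we have

\[
\begin{aligned}
-\frac{\mathrm{d}}{\mathrm{d}t}p_1(t) &= -\mu _1\, p_1(t) + \mu _2\, p_2(t),\\
-\frac{\mathrm{d}}{\mathrm{d}t}p_2(t) &= \mu _1\, p_1(t) - \mu _2\, p_2(t).
\end{aligned} \tag{7}
\]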

Solving the systems of differential equations (6) and (7) leads to the desired solutions (3) and (4), respectively. This requires determining four unknown constants, which is achieved by using: (i) the normalizing condition, i.e., \(p_1(t)+p_2(t)=1\) for all \(t\in [0,\tau )\), (ii) the continuity condition at \({\tilde{t}}\), i.e., \(\lim \limits _{t\rightarrow {\tilde{t}}^-} p_i(t)=p_i({\tilde{t}})\) for \(i\in \{1,2\}\), and (iii) the boundary condition at the SOs imposed by the policy and the imperfect maintenance probability, i.e., \((1-p)p_1(0) = \lim \limits _{t\rightarrow \tau ^-}p_1(t)\). \(\square \)
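The computation outlined in the proof can be carried out numerically as in the following sketch. It rests on assumptions that should be checked against Theorem 3 before use: between SOs, state 1 is left at rate \(\mu _1+\lambda p\) while the remaining time is at least \({\tilde{t}}\) and at rate \(\mu _1\) afterwards, state 2 is left at rate \(\mu _2\), and the cycle cost is the sum of the three parts named in the proof, with every preventive maintenance attempt charged regardless of its outcome. The function name `cost_rate` and its signature are hypothetical.

```python
import math


def cost_rate(mu1, mu2, lam, p, tau, t_tilde, c_so, c_uso, c_cm):
    """Return (long-run cost rate, stationary P(state 1) at an SO).

    A cycle is followed in forward time, starting just after an SO: a first
    phase of length tau - t_tilde in which USOs are used, then a second phase
    of length t_tilde in which USOs are ignored.
    """
    a = mu1 + mu2            # relaxation rate when USOs are ignored
    b = mu1 + mu2 + lam * p  # relaxation rate when USOs are used
    d1, d2 = tau - t_tilde, t_tilde

    def relax(x_start, rate, d):
        """P(state 1) at the end of a phase and its integral over the phase."""
        eq = mu2 / rate
        decay = math.exp(-rate * d)
        x_end = eq + (x_start - eq) * decay
        integral = eq * d + (x_start - eq) * (1.0 - decay) / rate
        return x_end, integral

    # P(state 1) at the next SO is affine in P(state 1) just after this SO.
    alpha = relax(relax(0.0, b, d1)[0], a, d2)[0]
    beta = relax(relax(1.0, b, d1)[0], a, d2)[0] - alpha
    # Boundary condition: P(state 1) just after an SO equals (1 - p) * p1_so.
    p1_so = alpha / (1.0 - beta * (1.0 - p))
    x0 = (1.0 - p) * p1_so
    x_mid, int1 = relax(x0, b, d1)
    _, int2 = relax(x_mid, a, d2)
    cycle_cost = (c_so * p1_so                  # PM attempts at SOs
                  + lam * c_uso * int1          # PM attempts at USOs
                  + mu1 * c_cm * (int1 + int2)) # corrective maintenance
    return cycle_cost / tau, p1_so
```

Two limiting cases provide a sanity check. For \(\lambda =0\), \(c_{\text {pm}}^{\text {so}}=0\) and large \(\tau \), the output approaches \(\frac{\mu _1\mu _2}{\mu _1+\mu _2}c_{\text {cm}}\); for \({\tilde{t}}=0\) and large \(\tau \), it approaches the cost rate of a system that is maintained at every USO in state 1.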

#### 4.2.1 Special cases

In case of only *scheduled opportunities*, which corresponds to the case \({\tilde{t}}\rightarrow \tau \) or, equivalently, to the case \(\lambda \rightarrow 0\), the probabilities \(p_i(t)\) for \(i \in \{1,2\}\) are derived from the system of linear equations in (7) plus the normalizing condition, i.e., \(p_1(t)+p_2(t)=1\) for all \(t\in [0,\tau )\). This yields

Plugging the above result into Eq. (2), after appropriately considering in Eq. (2) only the costs related to preventive maintenance at SOs and corrective maintenance,

leads to the long-run rate of cost per time unit in the case of only SOs.

In case of *perfect maintenance*, i.e., in the case \(p=1\), the boundary condition at the SOs imposed by the policy and the imperfect maintenance in the proof of Theorem 3 reduces to \(\lim \limits _{t\rightarrow \tau ^-}p_1(t)=0\), as immediately after an SO the system is restored to state 2 with probability 1. This enables us to explicitly solve the system of linear Eqs. (6) and (7), yielding

where

Combining this expression with Eq. (2) results in the long-run rate of cost per time unit in the case of perfect maintenance.

In the case of only *unscheduled opportunities*, which is equivalent to considering \(\tau \rightarrow \infty \), the condition of the system can be fully described using a double descriptor \({\mathcal {S}}=\left\{ (i,j):\ i\in \{1,2\}, \ j\in \{\text {SC},\text {USO}\}\right\} \) which is independent of time, and thus the new model formulation falls into the framework of regular Markov decision processes. It can be easily shown that: For state 2, the optimal policy is to do nothing, and, for state 1, the optimal policy is to repair if \(\frac{(\mu _1+\mu _2)c_{\text {pm}}^{\text {uso}}}{p} < \mu _1 c_{\text {cm}}\) and to do nothing otherwise. Furthermore, under the optimal policy the average long-run rate of cost is equal to

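In the case of only *corrective replacements*, each renewal cycle consists of an exponential sojourn in state 2 (rate \(\mu _2\)) followed by an exponential sojourn in state 1 (rate \(\mu _1\)) and ends with a corrective replacement, so the long-run rate of cost is equal to

\[ \frac{c_{\text {cm}}}{1/\mu _1+1/\mu _2}=\frac{\mu _1\mu _2}{\mu _1+\mu _2}\,c_{\text {cm}}. \]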
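The decision rule stated above for the case of only unscheduled opportunities is a single inequality, which the following sketch evaluates. The function name is hypothetical, and the two parameter sets are the wind and lithography cost structures used later in the numerical section; they are plugged in here purely for illustration.

```python
def repair_at_uso(mu1, mu2, c_pm_uso, c_cm, p):
    """Rule for state 1 when only USOs exist: repair iff
    (mu1 + mu2) * c_pm_uso / p < mu1 * c_cm."""
    return (mu1 + mu2) * c_pm_uso / p < mu1 * c_cm


# Wind-type costs: cheap preventive maintenance, expensive failure.
print(repair_at_uso(mu1=0.31, mu2=0.31, c_pm_uso=2000, c_cm=300_000, p=0.6))
# Lithography-type costs: preventive maintenance is comparatively expensive.
print(repair_at_uso(mu1=0.31, mu2=0.31, c_pm_uso=28_800, c_cm=75_500, p=0.6))
```

The left-hand side grows as *p* decreases, so a lower success probability can switch the rule from repairing to doing nothing.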

## 5 Deferring planned maintenance

In this section, we consider that upon a successful maintenance activity (preventive, at an SO or at a USO, or corrective), the upcoming planned maintenance is deferred for a period of length \(\tau \), i.e., at the instants of successful maintenance the remaining time until the next SO is reset to \(\tau \). We compute the long-run rate of cost under deferred maintenance; in Sect. 6.3, we combine the results of this section and of the previous sections to investigate the economic benefits of deferring planned maintenance.

Analogously to the analysis of Sect. 4.2, we derive the long-run rate of cost using renewal theory; see, for example, [31, Proposition 7.3, page 433]. In this case, we take the renewal points to be the instants at which a successful maintenance activity occurs, i.e., the SOs or USOs at which the preventive maintenance is perfect, or the epochs at which corrective maintenance is performed. Note that the underlying stochastic process that governs the condition of the system regenerates after each successful maintenance activity: after each such activity, the process is in state 2 with probability 1. The long-run rate of cost per time unit for a policy in the class of optimal policies is given in the next theorem. As the expressions appearing in the theorem do not simplify upon further computation, we present them in terms of probabilities and expectations associated with the exponential distribution; these expressions are straightforward (though cumbersome) to compute and give insight into each of the individual events contributing to the final expression, cf. Eq. (8).

### Theorem 4

Consider a given policy under which in state 2 we do nothing, and in state 1 we repair at scheduled opportunities and at unscheduled opportunities for which the remaining time until the next scheduled opportunity is greater than \({\tilde{t}}\in (0,\tau )\), and we do nothing otherwise. Furthermore, consider that planned maintenance is deferred after a successful maintenance. Under this setting, the long-run rate of cost per time unit equals

with

where the density of the truncated exponential random variable *Y* is given by

and for \(0\le y\le \tau \),

where \(\mathbb {1}_{\{x\}}\) is an indicator function taking value 1 if event *x* occurs, and it is zero otherwise, \(T_{\mu _1}\sim \text {Exp}(\mu _1)\), \(T_{\lambda p}\sim \text {Exp}(\lambda p)\), \({\mathbb {P}}\left[ \,\cdot \,\right] ={\mathbb {E}}[\mathbb {1}_{\{\cdot \}}]\) for all events in Eqs. (14)–(16), and \(C\!L{\mathop {=}\limits ^{d}}C\!L'\).

### Proof

See Appendix E. \(\square \)

## 6 Numerical results

Using the results and the analyses of the previous sections, in this section we illustrate through a few well-chosen examples the effect of the various parameters on the long-run rate of cost. In these examples, we investigate the financial advantage of the optimal policy compared to other (suboptimal) policies. Furthermore, we highlight the financial benefit of perfect maintenance by comparing the long-run rate of cost for the perfect maintenance model (\(p=1\)) to that of the imperfect maintenance model (\(p\in (0,1)\)). Here, we also show the influence of imperfect maintenance on the maintenance planning. In addition, we illustrate the change introduced by deferring planned maintenance after a successful maintenance. To illustrate the financial effects in a realistic context and to connect our analysis with practice, we use values and data stemming from the wind industry.

### 6.1 Comparison of the optimal policy to suboptimal policies

In this section we compute, in the context of the wind industry example, the long-run rate of cost under the optimal policy and we examine how it is affected by varying one by one the parameters \(\tau \), \(\lambda \) and \(c_{\text {pm}}^{\text {uso}}\), while keeping all other parameters fixed. For the determination of the values used in the numerical computations of this section, we consider the gearbox of a wind turbine. Statistics from a recent field study by Ribrant and Bertling [29] on Swedish wind parks in the period 1997–2005 showed that the gearbox is the most critical unit of a wind turbine. The notion of criticality is determined by the fact that a failure of the gearbox leads to the highest downtime when compared to all other wind turbine components, but also by the fact that this component has the highest failure rate among all wind turbine components [29, 34, 35]. Due to its extended downtime after a failure (which is captured in the corresponding maintenance cost), the corrective cost of a gearbox is relatively high compared to preventive maintenance costs; see, for example, Nilsson and Bertling [25]. Based on the values reported in the aforementioned studies, we set \(c_{\text {cm}}=300{,}000\), \(c_{\text {pm}}^{\text {so}}=1000\), \(\mu _2=0.31\), \(\mu _1=0.31\) and \(p=0.6\). In this case, the long-run rate of cost (in euros per year) in the case of only corrective replacements is equal to 46,500. Furthermore, motivated by the wind industry practice, we choose three different values for \(\tau \), that is \(\tau \in \{0.25, 0.5, 1\}\) (years). Next, we consider three different values for \(c_{\text {pm}}^{\text {uso}}\), i.e., \(c_{\text {pm}}^{\text {uso}} \in \{ 2000, 3000, 4000 \}\). Finally, with regard to \(\lambda \), we consider four different values, i.e., \(\lambda \in \{0.5, 1, 2, 4\}\).
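The benchmark value of 46,500 quoted above for the case of only corrective replacements can be checked with a short renewal-reward computation. The sketch assumes, in line with the model, exponential sojourn times in state 2 (rate \(\mu _2\)) and state 1 (rate \(\mu _1\)) and an instantaneous corrective replacement that returns the component to state 2; the function name is chosen here for illustration.

```python
def corrective_only_cost_rate(mu1, mu2, c_cm):
    """Renewal-reward rate: one corrective cost per cycle of mean length 1/mu2 + 1/mu1."""
    mean_cycle_length = 1.0 / mu2 + 1.0 / mu1
    return c_cm / mean_cycle_length


# Gearbox values from this section.
rate = corrective_only_cost_rate(mu1=0.31, mu2=0.31, c_cm=300_000)
print(round(rate, 2))  # 46500.0 (euros per year)
```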

In Table 2, we depict the long-run rate of cost for the above-mentioned values under four different policies: The first policy corresponds to replacements only at USOs (\(\pi _{\text {uso}}\)). The second policy corresponds to replacements only at SOs (\(\pi _{\text {so}}\)). The third policy is the optimal policy (\(\pi _{\text {opt}}\)), which is derived in Theorem 1. Note that it is numerically easier to obtain the optimal \({\tilde{t}}\) by minimizing the long-run rate of cost in Theorem 3 than from the closed-form expression in Theorem 1, as the latter requires solving for a root. The fourth policy is the optimal policy computed for \(p=1\). This assumption is motivated by practice, as it is often difficult to determine the value of *p* exactly and it is typically assumed that maintenance restores the component to a perfect state. This policy is denoted by \(\pi '_{\text {opt}}\).

In Table 2, we observe, across all instances, that incorporating planned maintenance can significantly reduce costs compared to only corrective maintenance, and that costs can be reduced even further by adding opportunistic maintenance. Intuitively, due to the cost structure, performing only planned maintenance at SOs considerably improves the long-run rate of cost compared to performing only opportunistic maintenance at USOs. Finally, if we compare \(\pi _{\text {opt}}\) with \(\pi '_{\text {opt}}\) we do not, despite the low value of *p*, observe significant differences. From an operational management perspective, this implies that, if decision makers do not have any knowledge about the value of *p*, and given a cost structure similar to that of the gearbox case, assuming perfect maintenance will result in a long-run rate of cost that is close to optimal regardless of the true value of *p*. This will be valid as long as the preventive maintenance cost (at both opportunities) is very small in comparison to the corrective maintenance cost, as is the case for the gearbox. As a rule of thumb, one can easily compute the expected number of maintenances (planned or opportunistic) required for a successful preventive maintenance and, based on this, compute the long-run rate of preventive maintenance cost (approximately of the order \(\max \{c_{\text {pm}}^{\text {so}},c_{\text {pm}}^{\text {uso}}\}/p\)) and compare it with the corrective cost. If the corrective cost is significantly higher, then one may assume that there is no significant difference between \(\pi _{\text {opt}}\) and \(\pi '_{\text {opt}}\), and as a consequence there is no significant difference in the values of the optimal policies under imperfect and perfect maintenance. In the next section, we investigate the savings that can be obtained by improving the performance of a repair when a decision maker has some knowledge regarding the value of *p*.
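The rule of thumb above can be sketched in a few lines. The sketch assumes that successive maintenance attempts succeed independently with probability *p*, so that the number of attempts until success is geometric with mean \(1/p\); the function name is hypothetical, and the second parameter set is the lithography cost structure used in Sect. 6.2.

```python
def pm_cost_until_success(c_pm_so, c_pm_uso, p):
    """Order-of-magnitude preventive cost: max(c_so, c_uso) times the expected
    number of attempts (1/p, geometric) needed for one successful maintenance."""
    expected_attempts = 1.0 / p
    return max(c_pm_so, c_pm_uso) * expected_attempts


cases = {
    "gearbox": (1000, 4000, 300_000),          # (c_pm_so, c_pm_uso, c_cm)
    "lithography": (26_500, 28_800, 75_500),
}
for name, (c_so, c_uso, c_cm) in cases.items():
    ratio = pm_cost_until_success(c_so, c_uso, p=0.6) / c_cm
    print(f"{name}: preventive cost / corrective cost = {ratio:.3f}")
```

A ratio close to zero, as for the gearbox, is the situation in which assuming perfect maintenance is expected to cost little; a ratio of the order of one half or more signals that *p* matters for the planning.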

### 6.2 Influence of imperfect maintenance

Let \(\pi ^{(p)}_{\text {opt}}\) represent the optimal policy as a function of the successful preventive maintenance probability *p* and let \(C(\pi ^{(p)}_{\text {opt}})\) denote the long-run rate of cost when the policy is \(\pi ^{(p)}_{\text {opt}}\). To demonstrate the effect of *p* on the rate of cost, we compute the relative increase in cost due to imperfect preventive maintenance as a function of *p*. This relative difference is denoted by \(\delta (p)\) and is equal to

\(\delta (p)\) indicates how much extra cost is incurred due to imperfect maintenance, and thus shows the benefit of improving the probability of executing a perfect maintenance.

In this numerical example, as before, we choose \(\mu _2=0.31\) and \(\mu _1=0.31\). Furthermore, we set \(\lambda = 4\) and \(\tau = 1\). Figure 3 shows \(\delta (p)\) for \(p \in [0.5,1]\) under two different cost structures (denoted by \(\delta (p)^1\) and \(\delta (p)^2\), respectively). Figure 4 depicts the corresponding optimal values of \({\tilde{t}}\) for both cost structures, denoted by \(t^1\) and \(t^2\), respectively. For \(\delta (p)^1\), we use the same cost structure as in the previous section, i.e., \(c_{\text {pm}}^{\text {so}} = 1000\), \(c_{\text {pm}}^{\text {uso}}=2000\) and \(c_{\text {cm}} = 300{,}000\), whereas, for \(\delta (p)^2\), we consider \(c_{\text {pm}}^{\text {so}} = 26{,}500\), \(c_{\text {pm}}^{\text {uso}}=28{,}800\) and \(c_{\text {cm}} = 75{,}500\). The preventive maintenance costs at SOs and USOs in the second cost structure are typical of the lithography industry (see [42]). Based on Fig. 3, we conclude that, under both cost structures, significant cost savings can be obtained by improving the probability of executing a perfect preventive maintenance (for example, by training).

The optimal control limit \({\tilde{t}}\), denoted by \(t^{1}\) and \(t^{2}\) under the first and second cost structure, respectively, is equal to \(t^{1}\approx 0.08\) and \(t^{2}\approx 0.39\) in the case of perfect repairs. In Fig. 4, where we plot \(t^{1}\) and \(t^{2}\) as functions of *p*, we observe the following regarding the influence of *p* on the maintenance planning: If the preventive maintenance cost (at both opportunities) is very small compared to the cost of corrective maintenance, the order of the total preventive maintenance cost incurred until a successful preventive maintenance compared to the corrective maintenance cost is still maintained. Therefore, the maintenance planning changes little with *p*: the optimal policy is to almost always perform preventive maintenance at USOs for all values of \(p\in [0.5,1]\). This also explains the small discrepancy between \(\pi _{\text {opt}}\) and \(\pi '_{\text {opt}}\) in Table 2. The situation is different for the second cost structure, where the maintenance planning changes substantially as a function of *p*. Whereas in the perfect case the optimal policy is to perform preventive maintenance at a USO if the residual time until the next SO is larger than 0.39, for \(p \lessapprox 0.83\) it is optimal to never perform preventive maintenance at a USO. Here, the order of the total preventive maintenance cost incurred until a successful preventive maintenance compared to the corrective maintenance cost is not maintained.

Also, in the opposite cost structure, i.e., \(c_{\text {pm}}^{\text {uso}}<c_{\text {pm}}^{\text {so}}\) (similar examples can be found for \(c_{\text {pm}}^{\text {uso}}=c_{\text {pm}}^{\text {so}}\)), the maintenance planning can be influenced significantly by the imperfect repair probability. For instance, consider the setting with \(\mu _1=1.1, \mu _2 =0.9\), \(c_{\text {pm}}^{\text {so}} = 4500\), \(c_{\text {pm}}^{\text {uso}}=4000\), \(c_{\text {cm}}=10{,}000\), and \(\lambda =0.5\). In case of perfect repairs (i.e., \(p=1\)), the optimal policy is to perform preventive maintenance in state 1 at both SOs and USOs, and to do nothing otherwise (cf. Theorem 2). However, if \(0.72 \lessapprox p \lessapprox 0.83\), the optimal policy is to only perform preventive maintenance at USOs, and if \(p \lessapprox 0.72\), then the optimal policy is to never perform PM. This example illustrates the influence of the imperfect repair probability on the maintenance planning.

### 6.3 Deferring of planned maintenance

In this section, we illustrate the change introduced by the action of deferring planned maintenance after the occurrence of a successful maintenance in three numerical examples that relate to the wind industry, the lithography industry, and to an artificially created example.

Figure 5 shows the long-run rate of cost for both the deferral and the no deferral case for the example with data stemming from the wind industry. As before, we use the cost parameters \(c_{\text {pm}}^{\text {so}} = 1000\), \(c_{\text {pm}}^{\text {uso}}=2000\) and \(c_{\text {cm}} = 300{,}000\). With regard to the other parameters, we set \(\lambda = 4\), \(\tau = 1\), \(\mu _1 = 0.31\), \(\mu _2 = 0.31\) and \(p=0.6\). We observe that deferring the planned maintenance both significantly increases the long-run rate of cost under the optimal policy (an increase of 28.14% from 8468.87 to 10,852.15) and changes the optimal control limit \({\tilde{t}}\) from 0.112 to 0.

Figure 6a, b depicts the long-run rate of cost for the deferral and the no deferral case, respectively, based on the values of the lithography industry example. We use the same cost parameters as in Sect. 6.2, that is, \(c_{\text {pm}}^{\text {so}} = 26{,}500\), \(c_{\text {pm}}^{\text {uso}}=28{,}800\) and \(c_{\text {cm}} = 75{,}500\). The other parameters remain unchanged, i.e., \(\lambda = 4\), \(\tau = 1\), \(\mu _1 = 0.31\), \(\mu _2 = 0.31\) and \(p=0.6\). As in the wind industry example, deferring the planned maintenance increases the long-run rate of cost under the optimal policy (an increase of 6533.3% from 12,840.12 to 851,727.53) and lowers the value of \({\tilde{t}}\) associated with the optimal policy (from 1 to 0.175). The drastic increase is due to the cost structure: the preventive maintenance costs (both at scheduled and unscheduled opportunities) are much closer to the corrective maintenance cost than in the wind industry example.

To illustrate that the opposite effect (albeit to a much lesser degree than in the previous two examples) can also hold, we create an artificial example where we set \(c_{\text {pm}}^{\text {so}} = 5000\), \(c_{\text {pm}}^{\text {uso}}=10{,}000\) and \(c_{\text {cm}} = 19{,}000\), and \(\lambda = 4\), \(\tau = 4\), \(\mu _1 = 1\), \(\mu _2 = 0.4\) and \(p=0.5\). Figure 7 depicts the long-run rate of cost for both the deferral and the no deferral case for this example. Here, we observe that, for all values of \({\tilde{t}}\), cost savings can be obtained by deferring planned maintenance after the occurrence of a successful opportunistic maintenance. More specifically, whereas the optimal value of \({\tilde{t}}\) is equal to 1 in both cases, the long-run rate of cost under the optimal policy decreases by 0.88%, from 6458.97 to 6402.44, when planned maintenance is deferred.
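The percentage changes reported for the three examples follow directly from the quoted cost rates, as the following sketch recomputes. The helper name is chosen here for illustration; the numbers are the optimal long-run cost rates stated in this section.

```python
def relative_change_percent(no_deferral, deferral):
    """Percentage change of the optimal cost rate when planned maintenance is deferred."""
    return 100.0 * (deferral - no_deferral) / no_deferral


# (cost rate without deferral, cost rate with deferral), as reported in the text.
examples = {
    "wind": (8468.87, 10852.15),
    "lithography": (12840.12, 851727.53),
    "artificial": (6458.97, 6402.44),
}
for name, (base, deferred) in examples.items():
    print(f"{name}: {relative_change_percent(base, deferred):+.2f}%")
```

A positive value means that deferral is more expensive; only the artificial example yields a (small) negative value.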

## 7 Conclusion

In this paper, we considered the maintenance policy for a three-state component degrading over time with corrective replacements at failures and preventive replacements at both scheduled and unscheduled opportunities under imperfect repair. By formulating this problem as a semi-Markov decision process, we were able to characterize the structure of the optimal maintenance policy as a control limit policy, where the control limit depends on the time until the next planned maintenance opportunity. Using this approach, a closed-form expression for the optimal control limit was derived. Within this class of control limit policies, we derived, using the theory of regenerative processes, an explicit expression for the long-run rate of cost. Using a similar approach based on renewal theory, we derived an expression for the long-run rate of cost in the case when planned maintenance is deferred after the occurrence of a successful opportunistic maintenance.

A cost comparison with suboptimal policies illustrated the benefits of optimizing the maintenance policy. Specifically, it was found that incorporating planned maintenance can significantly reduce costs compared to only corrective maintenance, and that costs can be reduced even further by adding opportunistic maintenance. Moreover, numerical results indicate that the extent of the impact of the perfect repair probability on the optimal policy depends on the underlying cost structure. It was also shown that substantial cost savings can be obtained by improving the perfect repair probability. Finally, our numerical examples indicate that deferring planned maintenance after a successful opportunistic maintenance may either increase or decrease the total cost.

There are a number of extensions and topics for future research. The most important direction is to consider the network dependency on the level of the structural degradation and failure dependencies, i.e., to consider a multi-dimensional process that captures the degradation of the various assets in the network. Such a future direction would be particularly interesting in the case of a small number of assets for which the Poisson approximation for the opportunistic maintenance may not be accurate. In addition, another very interesting research direction would be to consider a more general model in which the condition of the system degrades through \(N>2\) states. Next, in this analysis we have assumed that the condition of the system is fully observable. However, in many real applications, condition monitoring data such as spectrometric oil data or vibration data give only partial information about the underlying state of the system. From this perspective, it would be interesting to extend the model at hand to a partially observable model in which the condition monitoring data are stochastically related to the true system state. Finally, the results in this paper are valid for systems with hypo-exponentially distributed lifetimes. Future research could relax this assumption by considering a phase-type lifetime distribution.

## References

1. Arts, J., Basten, R.: Design of multi-component periodic maintenance programs with single-component models. IISE Trans. **50**(7), 606–615 (2018)
2. Baker, R.D., Christer, A.H.: Review of delay-time OR modelling of engineering aspects of maintenance. Eur. J. Oper. Res. **73**(3), 407–422 (1994)
3. Bhattacharya, R.N., Majumdar, M.: Controlled semi-Markov models under long-run average rewards. J. Stat. Plan. Inference **22**(2), 223–242 (1989)
4. Bevilacqua, M., Braglia, M.: The analytic hierarchy process applied to maintenance strategy selection. Reliab. Eng. Syst. Saf. **70**(1), 71–83 (2000)
5. Christer, A., Waller, W.: Delay time models of industrial inspection maintenance problems. J. Oper. Res. Soc. **35**(5), 401–406 (1984)
6. Christer, A.H.: Modelling inspection policies for building maintenance. J. Oper. Res. Soc. **33**(8), 723–732 (1982)
7. Christer, A.H.: Developments in delay time analysis for modelling plant maintenance. J. Oper. Res. Soc. **50**(11), 1120–1137 (1999)
8. Feinberg, E.A.: Constrained semi-Markov decision processes with average rewards. Math. Methods Oper. Res. **39**, 257–288 (1994)
9. Hernández-Lerma, O., Lasserre, J.B.: Further topics on discrete-time Markov control processes, vol. 42. Springer, Berlin (2012)
10. Jardine, A.K.S., Lin, D., Banjevic, D.: A review on machinery diagnostics and prognostics implementing condition-based maintenance. Mech. Syst. Signal Process. **20**(7), 1483–1510 (2006)
11. Jardine, A.K.S., Tsang, A.H.C.: Maintenance, replacement, and reliability: theory and applications. CRC Press, Boca Raton (2005)
12. Jaśkiewicz, A.: An approximation approach to ergodic semi-Markov control processes. Math. Oper. Res. **54**(1), 1–19 (2001)
13. Jaśkiewicz, A.: On the equivalence of two expected average cost criteria for semi-Markov control processes. Math. Oper. Res. **29**(2), 326–338 (2004)
14. Kalosi, S., Kapodistria, S., Resing, J.A.C.: Condition-based maintenance at both scheduled and unscheduled opportunities. In: Scarf, P., Wu, S., Do, P. (eds.) Proceedings of the 9th IMA International Conference on Modelling in Industrial Maintenance and Reliability (2016), ISBN: 978-0-905091-31-0. arXiv:1607.02299
15. Khinchin, A.Y.: Sequences of chance events without after-effects. Theory Probab. Appl. **1**(1), 1–15 (1956)
16. Kim, M.J., Jiang, R., Makis, V., Lee, C.G.: Optimal Bayesian fault prediction scheme for a partially observable system subject to random failure. Eur. J. Oper. Res. **214**(2), 331–339 (2011)
17. Kim, M.J., Makis, V.: Joint optimization of sampling and control of partially observable failing systems. Oper. Res. **61**(3), 777–790 (2013)
18. Lam, J.Y.J., Banjevic, D.: A myopic policy for optimal inspection scheduling for condition based maintenance. Reliab. Eng. Syst. Saf. **144**, 1–11 (2015)
19. Lippman, S.A.: On dynamic programming with unbounded rewards. Manag. Sci. **21**(11), 1225–1233 (1975)
20. Maillart, L.M., Pollock, S.M.: Cost-optimal condition-monitoring for predictive maintenance of 2-phase systems. IEEE Trans. Reliab. **51**(3), 322–330 (2002)
21. Makis, V., Wu, J., Gao, Y.: An application of DPCA to oil data for CBM modeling. Eur. J. Oper. Res. **174**(1), 112–123 (2006)
22. Mobley, R.K.: An Introduction to Predictive Maintenance. Elsevier, Amsterdam (2002)
23. Nakagawa, T.: Optimum policies when preventive maintenance is imperfect. IEEE Trans. Reliab. **28**(4), 331–332 (1979a)
24. Nakagawa, T.: Imperfect preventive-maintenance. IEEE Trans. Reliab. **28**(5), 402–402 (1979b)
25. Nilsson, J., Bertling, L.: Maintenance management of wind power systems using condition monitoring systems—life cycle cost analysis for two case studies. IEEE Trans. Energy Convers. **22**(1), 223–229 (2007)
26. Peng, Y., Dong, M., Zuo, M.J.: Current status of machine prognostics in condition-based maintenance: a review. Int. J. Adv. Manuf. Technol. **50**(1–4), 297–313 (2010)
27. Pham, H., Wang, H.: Imperfect maintenance. Eur. J. Oper. Res. **94**(3), 425–438 (1996)
28. Prajapati, A., Bechtel, J., Ganesan, S.: Condition based maintenance: a survey. J. Qual. Maint. Eng. **18**(4), 384–400 (2012)
29. Ribrant, J., Bertling, L.: Survey of failures in wind power systems with focus on Swedish wind power plants during 1997–2005. IEEE Trans. Energy Convers. **22**(1), 167–173 (2007)
30. Ross, S.M.: Average cost semi-Markov decision processes. J. Appl. Probab. **7**(3), 649–656 (1970)
31. Ross, S.M.: Introduction to Probability Models. Academic Press, Cambridge (2014)
32. Schäl, M.: On the second optimality equation for semi-Markov decision models. Math. Oper. Res. **17**(2), 470–486 (1992)
33. Serfozo, R.: Basics of Applied Stochastic Processes. Springer, Berlin (2009). 2nd printing, 2012 edition
34. Spinato, F., Tavner, P.J., Van Bussel, G.J.W., Koutoulakos, E.: Reliability of wind turbine subassemblies. IET Renew. Power Gener. **3**(4), 387–401 (2009)
35. Tavner, P.J., Xiang, J., Spinato, F.: Reliability analysis for wind turbines. Wind Energy Int. J. Prog. Appl. Wind Power Convers. Technol. **10**(1), 1–18 (2007)
36. Van Oosterom, C., Elwany, A., Çelebi, D., Van Houtum, G.J.: Optimal policies for a delay time model with postponed replacement. Eur. J. Oper. Res. **232**(1), 186–197 (2014)
37. Vega-Amaya, O., Luque-Vásquez, F.: Sample-path average cost optimality for semi-Markov control processes on Borel spaces: unbounded costs and mean holding times. Appl. Math. **27**(3), 343–367 (2000)
38. Wang, W.: Delay time modelling. In: Kobbacy, K.A.H., Murthy, D.N.P. (eds.) Complex System Maintenance Handbook, pp. 345–370. Springer, Berlin (2008)
39. Wang, W.: An overview of the recent advances in delay-time-based maintenance modelling. Reliab. Eng. Syst. Saf. **106**, 165–178 (2012)
40. Yang, M., Makis, V.: ARX model-based gearbox fault detection and localization under varying load conditions. J. Sound Vib. **329**(24), 5209–5221 (2010)
41. Yushkevich, A.A.: On semi-Markov controlled models with an average reward criterion. Theory Probab. Appl. **26**(4), 796–803 (1982)
42. Zhu, Q., Peng, H., Timmermans, B., Van Houtum, G.J.: A condition-based maintenance model for a single component in a system with scheduled and unscheduled downs. Int. J. Prod. Econ. **193**, 365–380 (2017)
43. Zhu, Q., Peng, H., Van Houtum, G.J.: An age-based maintenance policy using the opportunities of scheduled and unscheduled system downs. Beta report, Eindhoven University of Technology (2016)
44. Zio, E., Compare, M.: Evaluating maintenance policies by quantitative modeling and analysis. Reliab. Eng. Syst. Saf. **109**, 53–65 (2013)

## Acknowledgements

The authors gratefully acknowledge the contribution of S. Kalosi in the early stages of the preparation of the work. The authors would like to thank M. Barbieri, J. Korst, and V. Pronk (all Philips Research), and O. J. Boxma and G. J. van Houtum (both Eindhoven University of Technology) for their time and advice in the preparation of this work. The work of C. Drent is supported by the Data Science Flagship framework, a cooperation between the Eindhoven University of Technology and Philips. The work of S. Kapodistria is supported by the NWO Gravitation Project ‘NETWORKS’ of the Dutch government.

## Additional information

### Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Appendices

### Optimality equations for semi-Markov decision process

We consider the so-called ratio-average cost for a controlled semi-Markov decision process, which corresponds to the limit superior of the expected total cost over a finite number of jumps divided by the expected cumulative time of these jumps; see Ross [30], Feinberg [8] and Schäl [32], for instance.

We shall use here the definition of a controlled semi-Markov decision process from Lippman [19], Yushkevich [41] and Jaśkiewicz [13]. A controlled semi-Markov decision process is specified by five objects: a Borel state space \({\mathcal {S}}\), a Borel action space \({\mathcal {A}}\), a law of motion—a measurable projection determining the state as a function of an action, a transition function (transition law) \({\mathcal {P}}\)—a probability measure depending measurably on the state and the action, and a reward (or cost) function *c*.

The process is observed at time \(t = 0\) to be in some state \(x_0\in {\mathcal {S}}\). At that time, an action \(a_0\in {\mathcal {A}}_{x_0}\) is chosen, where \({\mathcal {A}}_{x_0}\) is a compact set of actions available in state \(x_0\). The set of all actions is \({\mathcal {A}}\) and is also assumed to be a Borel space.

For the problem at hand, the state space is

and the action space is \({\mathcal {A}} =\{\text {perform PM, do nothing, perform CM}\}\), cf. Sect. 4.1.

If the current state is \(x_0\) and action \(a_0\) is selected, then the immediate cost \(c(x_0;a_0)\) is incurred, and the system remains in state \(x_0 \) for a random time \(t_1\), with the cumulative distribution depending only on \(x_0\) and \(a_0\). Afterward, the system jumps to the state \(x_1 \) according to the probability measure (transition law) \({\mathcal {P}}( \cdot \,|\, x_0, a_0,t_1)\). This procedure yields a trajectory \((x_0,a_0,t_1,x_1,a_1,t_2,\ldots )\) of some stochastic process, where \(x_n\) is the state, \(a_n\) is the control variable and \(t_n\) is the time of the *n*-th transition, \(n=0,1,\ldots \). In the sequel, we shall refer to the corresponding random variables by means of their capital letters: \(T_n\)—the random time of the *n*-th transition, for \(n=1,2,\ldots ,\) with \(T_0:=0\), \(X_n\)—the state at time \(T_n\), and \(A_n\)—the action at time \(T_n\).

Let \(H_n\) be the space of admissible histories up to the *n*-th transition, \(H_n:=({\mathcal {S}}\times {\mathcal {A}}\times [0,\infty ))^n\times {\mathcal {S}}\) and \(H_0:={\mathcal {S}}\). An element \(h_n\) of \(H_n\) is called a partial history of the process and is of the form \(h_n=(x_0,a_0, t_1, \ldots , x_{n-1},a_{n-1}, t_n, x_n)\). A control policy (or policy) is a sequence \(\{\pi _n\}\), where each \(\pi _n\) is a conditional probability \(\pi _n(\cdot \,|\, h_n)\) on the control set \({\mathcal {A}}_{x_n}\), given the entire history \(h_n\), such that \(\pi _n({\mathcal {A}}_{x_n}\,|\, h_n)=1\) for all \(h_n\), \(n=1,2,\ldots \). The class of all policies is denoted by \(\Pi \), and let \(\Pi _{DS}\) denote the class of all deterministic stationary policies.

For each initial state \(x_0\in {\mathcal {S}}\) and for each policy \(\pi \in \Pi \), there exists a unique probability measure \({\mathbb {P}}^\pi _{x_0}\) such that

Further, let \(\tau (x,a)\) denote the conditional mean sojourn (holding) time spent in state *x* under action *a*, i.e.,

and let \({\tilde{F}}_{x}^a(\alpha )\) denote the Laplace–Stieltjes transform of the sojourn time spent in state *x* under action *a*, i.e.,

For the problem at hand, the cost function is defined as follows:

Let \({\mathcal {P}}_{x_n}^{a_n}(s,x_{n+1})\) denote the joint density/mass of the transition time \(T_{n+1}-T_n\) and the allowed next state \(X_{n+1}\), given the current state \(X_n=x_n\) and the allowed action \(a_n\). For \(x_n=(1,\text {SC},t)\), \(t\in (0,\tau )\), and \(a_n=\{\text {perform CM}\}\),

For the derivation of the above probabilities, it suffices to note that there are three possible evolutions in terms of the state of the system: either an SO or an SC or a USO, where the time till the SO is equal to *t*, while the times till the next SC and the USO are exponentially distributed with rates \(\mu _2\) and \(\lambda \), respectively. The probabilities for \(x_n=(2,\text {USO},t)\) and \(a_n=\{\text {do nothing}\}\) or \(a_n=\{\text {perform PM}\}\) are identical. The remaining probabilities are obtained using very similar arguments.

From the joint distributions, the marginal cumulative distribution function of the transition time \(T_{n+1}-T_n\) follows immediately. For \(x_n =(1,\text {SC},t)\) and \(a_n=\{\text {perform CM}\}\):

The distributions of the transition time from state \(x_n=(2,\text {USO},t)\) under actions \(a_n=\{\text {do nothing}\}\) and \(a_n=\{\text {perform PM}\}\) are identical. The marginal distributions for the other states and actions follow analogously.

Having fully defined the probabilities for the problem at hand, we now state, following the proofs in Bhattacharya and Majumdar [3], the proposition below. It guarantees that (1) a dynamic programming equation holds for the optimal reward (this equation is typically referred to as the average optimality equality or as the Bellman equation), and (2) this equation yields a deterministic stationary policy that is optimal for the long-run average reward.

### Proposition 3

For the model at hand, there exist a bounded function \(V(\cdot )\) and a constant *g* such that

Moreover, the deterministic stationary policy \(\pi ^{*(\infty )}\in \Pi _{DS}\) is optimal for the ratio-average cost criterion with

where

### Proof

The proof of the proposition relies on the fact that the costs *c*(*x*; *a*) are nonnegative and upper bounded by \(c_{\text {cm}}\). We follow here the ideas presented in Bhattacharya and Majumdar [3] and in Theorems 10.3.1 & 10.3.6 in [9, Sections 10.4 and 10.5]. Following the ideas therein, we consider the corresponding \(\alpha \)-discounted cost criterion

and \(V_\alpha (x)=\inf _{\pi \in \Pi }V_\alpha (x,\pi )\). The main steps in the proof of the proposition are as follows.

**Step 1:** Show that the optimal reward \(V_\alpha (x)\) under discounting is continuous and bounded. The latter follows easily by noting that \(V_\alpha (x)\) is bounded by \(V_\alpha (x,\pi _{\text {DN}})\), where \(\pi _{\text {DN}}\) denotes the policy of doing nothing at all opportunities, unless the component fails, in which case it is mandatory to do corrective maintenance. This yields

$$\begin{aligned} V_\alpha (x)\le c_{\text {cm}}\frac{\frac{\mu _1}{\mu _1+\alpha }}{1-\frac{\mu _1}{\mu _1+\alpha }\frac{\mu _2}{\mu _2+\alpha }},\ \forall x\in {\mathcal {S}}. \end{aligned}$$

Analogously,

$$\begin{aligned} g\equiv J^*(x)\le c_{\text {cm}}\frac{\mu _1\mu _2}{\mu _2+\mu _1}. \end{aligned}$$

See Appendix A.1 for further details.

**Step 2:** Show that the discounted Bellman equation

$$\begin{aligned} V_\alpha (x)&=\min _{a\in {\mathcal {A}}_x} \Big \{ c(x;a)+\int _{s} \int _{y} e^{-\alpha s}V_\alpha (y)\,{\mathcal {P}}^a_x (\mathrm {d}y\,|\,s)\,\mathrm {d}F_x^a(s)\Big \},\quad x\in {\mathcal {S}}, \end{aligned}\tag{25}$$

holds. Also, there exists a Borel measurable function that minimizes the right-hand side of the discounted Bellman equation for every \(x\in {\mathcal {S}}\). The corresponding deterministic stationary policy is optimal under discounting. The proof follows verbatim the steps in [3, Theorem 3.1 on page 227].

**Step 3:** Choose \(z\in {\mathcal {S}}\). Then, for all \(x\in {\mathcal {S}}\), show that \(|V_\alpha (x)-V_\alpha (z)|\) is bounded for all \(\alpha >0\). This often follows from the geometric ergodicity of the underlying controlled Markov model. In the case under consideration, this is proven by noting that from all states \(x=(i,j,t) \in {\mathcal {S}}\), after time *t* the system is in an SO state with probability 1. This yields

$$\begin{aligned} |V_\alpha (x)-V_\alpha (z)|&\le c_{\text {cm}}\left( 2+\left( \lambda +\mu _1+\mu _2+1+\frac{\mu _1\mu _2}{\mu _1+\mu _2}\right) (\tau _{x,1}+\tau _{z,1})\right) , \end{aligned}$$

with \(\tau _{x,1}<\infty \) denoting the expectation of the first passage time from state \(x\in {\mathcal {S}}\) to state \((1,\text {SO},0)\). See Appendix A.2 for further details. A consequence of the above finding is that, for all deterministic stationary policies \(\pi \in \Pi _{DS}\), the expected average cost in (24) is independent of *x*.

**Step 4:** Show that there exists a solution, say *g*, to the average optimality equality (23). There exists a Borel measurable function \(\pi ^*\) on \({\mathcal {S}}\) into \({\mathcal {A}}\) such that the minimum on the right-hand side of (23) is attained at \(\pi ^*(x)\), \(x\in {\mathcal {S}}\). The proof follows verbatim the steps in [3, Theorem 3.2 (a) & (b) on page 228].

**Step 5:** Show that the stationary policy \(\pi ^{*(\infty )}\) is optimal for the long-run average reward and *g* is the optimal reward, with \(g=\limsup _{\alpha \rightarrow 0^+}\alpha V_\alpha (x)\). See Appendix A.3 for further details. \(\square \)

Equivalent propositions (based on different methods and, more importantly, on different assumptions regarding the geometric ergodicity) can be found, for example, in Jaśkiewicz [12], Vega-Amaya and Luque-Vásquez [37], and Jaśkiewicz [13].

### A.1 Proof of Step 1

Let \(\pi _{\text {DN}}\) denote the policy of doing nothing at all opportunities, unless the component fails, in which case it is mandatory to do corrective maintenance. Under this policy, \(V_\alpha (x,\pi _{\text {DN}})\) can be computed using first-step analysis. Note that under this policy, it is not required to keep track of the remaining time to the next SO. Say \(x=(i,j,\cdot )\). If \(i=1\), then after an exponentially distributed time with rate \(\mu _1\), say \(T_{\mu _1}\), the component will fail and a cost \(c_{\text {cm}}\) will be incurred. If \(i=2\), then after a hypoexponentially distributed time with rates \((\mu _2,\mu _1)\), say \(T_{\mu _1}+T_{\mu _2}\) (the two random times are independent), the component will fail and a cost \(c_{\text {cm}}\) will be incurred. All in all,

Similarly,

which yields, upon solving for \(V_\alpha ((1,\text {SC},\cdot ),\pi _{\text {DN}})\) and substituting that \({\mathbb {E}}[e^{-\alpha T_{\mu _i}}]=\frac{\mu _i}{\mu _i+\alpha }\), \(i=1,2\),

Combining the last equation with (26) yields

The proof for the long-run average cost follows by employing a simple renewal argument.
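The relation between the discounted bound and the long-run average bound of Step 1 can be checked numerically. The sketch below is illustrative only: it evaluates the discounted bound for the do-nothing policy, multiplies it by \(\alpha \), and compares it with the renewal-reward rate (one cost \(c_{\text {cm}}\) per cycle of mean length \(1/\mu _1+1/\mu _2\)). The function names and parameter values are hypothetical.

```python
def discounted_cost_do_nothing(alpha, mu1, mu2, c_cm):
    """Upper bound on V_alpha from Step 1 (do-nothing policy)."""
    a1 = mu1 / (mu1 + alpha)  # E[exp(-alpha * T_mu1)]
    a2 = mu2 / (mu2 + alpha)  # E[exp(-alpha * T_mu2)]
    return c_cm * a1 / (1.0 - a1 * a2)

def average_cost_do_nothing(mu1, mu2, c_cm):
    """Renewal-reward rate: cost c_cm per cycle of mean length 1/mu1 + 1/mu2."""
    return c_cm / (1.0 / mu1 + 1.0 / mu2)

mu1, mu2, c_cm = 0.2, 0.05, 10.0  # hypothetical values
g_dn = average_cost_do_nothing(mu1, mu2, c_cm)
for alpha in (1e-1, 1e-2, 1e-3, 1e-4):
    # alpha * V_alpha approaches the average-cost bound as alpha decreases
    print(alpha, alpha * discounted_cost_do_nothing(alpha, mu1, mu2, c_cm), g_dn)
```

As \(\alpha \) decreases, the product \(\alpha V_\alpha \) approaches \(c_{\text {cm}}\mu _1\mu _2/(\mu _1+\mu _2)\), in line with the vanishing-discount argument used in Step 5.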

### A.2 Proof of Step 3

Choose \(x=(i,j,t) \in {\mathcal {S}}\). Let \(T_{x,1}\) denote the first passage time from state *x* to state \((1,\text {SO},0)\), and \({\tilde{F}}_{x,1}(\alpha )={\mathbb {E}}[e^{-\alpha T_{x,1}}]\) and \(\tau _{x,1}={\mathbb {E}}[T_{x,1}]\).

Starting from state *x*, after time \(t\in [0,\tau )\), the system is in an SO state with probability 1. More concretely, under the optimal policy (which is deterministic stationary), say \(\pi ^{(\infty )}_\alpha \), starting in state \(x=(i,j,t)\), it will end up in state \((1,\text {SO},0)\) after time *t* with probability \(p_x\), and in state \((2,\text {SO},0)\) with probability \(1-p_x\). In case state *x* coincides with an SO state then \(p_x=0\) or \(p_x=1\). Once in an SO state, the system state observed at only the SO epochs behaves like a discrete time (irreducible and aperiodic) Markov chain with only states \((1,\text {SO},0)\) and \((2,\text {SO},0)\). Thus, \(\tau _{x,1}={\mathbb {E}}[T_{x,1}]=\lim _{\alpha \rightarrow 0^+}\frac{1-{\tilde{F}}_{x,1}(\alpha )}{\alpha }<\infty \).

From the above,

Note that \({\mathbb {E}}_{x}^{\pi ^{(\infty )}_\alpha }[\alpha \text {-cost from }x \text { to state }(1,\text {SO},0)\text { in }T_{x,1}]\) is equal to: (1) the expected discounted cost incurred directly in state *x*, which is upper bounded by \(c_{\text {cm}}\), (2) the total expected discounted cost of all the SOs that occur in time \(T_{x,1}\), which is upper bounded by \(c_{\text {cm}} \tau _{x,1}\), (3) the total expected discounted cost of all the USOs that occur in time \(T_{x,1}\), which is upper bounded by \(c_{\text {cm}}\lambda \tau _{x,1}\), and (4) the total expected discounted cost of all the corrective maintenance opportunities that occur in time \(T_{x,1}\), which is upper bounded by \(c_{\text {cm}}(\mu _1+\mu _2) \tau _{x,1}\). All in all,

Then, straightforward computations yield

Similarly, for \(z=(i',j',t')\in {\mathcal {S}}\),

Then,

Combining the above with Step 1 yields

Lastly, note that \( (1-{\mathbb {E}}_{x}^{\pi _\alpha ^{(\infty )}}[e^{-\alpha T_{x,1}}]+1-{\mathbb {E}}_{z}^{\pi _\alpha ^{(\infty )}}[e^{-\alpha T_{z,1}}] )\frac{\frac{\mu _1}{\mu _1+\alpha }}{1-\frac{\mu _1}{\mu _1+\alpha }\frac{\mu _2}{\mu _2+\alpha }}\le (\tau _{x,1}+\tau _{z,1})\frac{\mu _1\mu _2}{\mu _1+\mu _2}\), which yields

### A.3 Proof of Step 5

To prove this step, we follow to a large extent the approach in [3, Theorem 3.2 (c)–(d)]. Consider the average optimality equality (23). This yields, for an arbitrary policy \(\pi \),

which can be equivalently written as

Taking expectations on both sides, one gets

Summing both sides of the above equation over \(k = 0, 1,\ldots , N - 1\), and dividing by \({\mathbb {E}}_x^\pi \left[ \sum _{k=0}^{N-1}\tau (X_{k};a_{k})\right] \) one has

Note that as \(N\rightarrow \infty \), for \(x=(i,j,t)\):

As such, \({\mathbb {E}}_x^\pi \left[ \sum _{k=0}^{N-1}\tau (X_{k};a_{k})\right] \) is bounded from below for large values of *N*. Taking \(\limsup \limits _{N\rightarrow \infty }\) on both sides of Eq. (27) yields \(J(x,\pi )\ge g\).

Since, for \(\pi ^{*(\infty )}\), the above analysis holds with an equality, it is evident that \(J(x,\pi ^{*(\infty )})= g\). Note that *g* is an arbitrary limit point of \(\alpha V_\alpha (x)\) as \(\alpha \rightarrow 0^+\). Furthermore, since \(\alpha |V_{\alpha }(x)-V_{\alpha }(z)|\rightarrow 0\) as \(\alpha \rightarrow 0^+\) for all *x* and for all *z*, it is now evident that \(g=\limsup _{\alpha \rightarrow 0^+}\alpha V_\alpha (x)\) for all \(x\in {\mathcal {S}}\).

### Average cost equalities—Bellman equations

We proceed by writing down the average cost equalities for the model at hand, cf. Proposition 3. More concretely, for \(t\in [0,\tau )\), let *V*(*i*, *j*, *t*) be the value function when the state of the system is \((i,j,t)\in {\mathcal {S}}\). The average optimality equations read as follows:

In this paragraph, we explain in detail how Eq. (28) is obtained. State \((2,\text {SC},t)\) is associated with only the decision ‘do nothing.’ Therefore, no minimum operator appears on the right-hand side of Eq. (28) and the corresponding cost is equal to zero. For the other terms appearing on the right-hand side of Eq. (28), it suffices to note that there are three possible evolutions in terms of the state of the system: either an SO or an SC or a USO, where the time till the next SO is equal to *t*, while the times till the SC and USO are exponentially distributed with rates \(\mu _1\) and \(\lambda \), respectively. In particular, the expected sojourn time of the semi-Markov decision process in state \((2,\text {SC},t)\) can be calculated as the expectation of the minimum of a deterministic time *t* and two exponentially distributed times, which can be easily verified to be equal to \(\int _{0}^{t} e^{-(\mu _1+\lambda )x}\,\mathrm {d}x\). The set of optimality equations for the remaining states can be obtained using very similar arguments. Note that in Eqs. (30)–(33), inside the minimum, the left term corresponds to the action ‘perform preventive maintenance,’ while the right term corresponds to the action ‘do nothing.’
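The expected sojourn time in state \((2,\text {SC},t)\) can be sanity-checked by simulation. The following sketch assumes independent exponential times with rates `mu1` and `lam`, and compares the closed form of the integral, \((1-e^{-(\mu _1+\lambda )t})/(\mu _1+\lambda )\), with a Monte Carlo estimate of the mean of the minimum. Function names, the seed, and the parameter values are hypothetical.

```python
import math
import random

def expected_sojourn_closed_form(t, mu1, lam):
    """Integral of exp(-(mu1 + lam) * x) over [0, t]."""
    rate = mu1 + lam
    return (1.0 - math.exp(-rate * t)) / rate

def expected_sojourn_monte_carlo(t, mu1, lam, n=200_000, seed=0):
    """Sample mean of min(t, T_mu1, T_lam) with independent exponential times."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += min(t, rng.expovariate(mu1), rng.expovariate(lam))
    return total / n

t, mu1, lam = 2.0, 0.5, 0.3  # hypothetical values
print(expected_sojourn_closed_form(t, mu1, lam))
print(expected_sojourn_monte_carlo(t, mu1, lam))
```

The two printed values should agree up to simulation noise.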

We observe that, since \(c_{\text {pm}}^{\text {so}}, c_{\text {pm}}^{\text {uso}}>0\) and \(p+q=1\), Eqs. (30) and (31) yield that it is never optimal to perform preventive maintenance in state 2, at USOs and at SOs, respectively.

We define the following auxiliary functions, for \(t\in [0,\tau )\):

so that Eqs. (28)–(33) reduce to

### Proof of Theorem 1

### Proof

We distinguish four cases, each corresponding to a different set of actions. Case (i): \(F_1(\tau )-F_2(\tau )\le \frac{c_{\text {pm}}^{\text {so}}}{p}\); Case (ii): \(\frac{c_{\text {pm}}^{\text {so}}}{p}< F_1(\tau )-F_2(\tau )< \frac{c_{\text {pm}}^{\text {uso}}}{p}\); Case (iii): \(\frac{c_{\text {pm}}^{\text {uso}}}{p} < F_1(\tau )-F_2(\tau )\); Case (iv): \(F_1(\tau )-F_2(\tau )= \frac{c_{\text {pm}}^{\text {uso}}}{p}\).

**Case (i):** In state \((2,\text {SO},0)\), it is optimal to not perform preventive maintenance. Furthermore, from the assumption

$$\begin{aligned} F_1(\tau )-F_2(\tau )&\le \frac{c_{\text {pm}}^{\text {so}}}{p} \end{aligned}\tag{38}$$

and Eq. (37) for \(i=1\), it becomes evident that it is also optimal to not perform preventive maintenance in state \((1,\text {SO},0)\). Since the function \(F_1(t)-F_2(t)\) is, by definition, a continuous function in \(t\in [0,\tau ]\), \(c_{\text {pm}}^{\text {so}}<c_{\text {pm}}^{\text {uso}}\), and taking into account Eq. (38), it is evident that there exists an \(\varepsilon >0\) such that

$$\begin{aligned} F_1(t)-F_2(t) \le \frac{c_{\text {pm}}^{\text {uso}}}{p}, \text { for all }t\in (\tau -\varepsilon ,\tau ]. \end{aligned}\tag{39}$$

Equation (39), in light of Eq. (36), implies that if the elapsed time from the SO is less than \(\varepsilon \), then, under the assumption that it is optimal to not perform preventive maintenance on the system in state \((i,\text {SO},0)\), it is also not optimal to perform preventive maintenance at a USO. In this case, for \(t\in (\tau -\varepsilon ,\tau ]\), we have that \(V(1,\text {USO},t)=F_1(t)\) and \(V(2,\text {USO},t) = F_2(t)\), cf. Eq. (36). Taking the derivative with respect to *t* in Eq. (34) and substituting the above-obtained values for \(V(1,\text {USO},t)\) and \(V(2,\text {USO},t)\) yields

$$\begin{aligned} F_1'(t)-F_2'(t)&=-\, (\mu _1+\lambda )F_1(t)+\mu _1 V(1,\text {SC},t)+\lambda V(1,\text {USO},t)\\&\quad +\, (\mu _2+\lambda )F_2(t)-\mu _2 V(2,\text {SC},t)-\lambda V(2,\text {USO},t)\\&= -\, (\mu _1+\mu _2)F_1(t) + (\mu _1+\mu _2)F_2(t) + \mu _1 c_{\text {cm}},\ t\in (\tau -\varepsilon ,\tau ]. \end{aligned}$$

The solution to the above differential equation reads

$$\begin{aligned} F_1(t)-F_2(t) = \frac{\mu _1c_{\text {cm}}}{\mu _1+\mu _2} + \left( F_1(\tau )-F_2(\tau ) - \frac{\mu _1 c_{\text {cm}}}{\mu _1 + \mu _2} \right) e^{(\mu _1+\mu _2)(\tau -t)},\quad t\in (\tau -\varepsilon ,\tau ]. \end{aligned}\tag{40}$$

If \(F_1(\tau )-F_2(\tau ) - \frac{\mu _1 c_{\text {cm}}}{\mu _1 + \mu _2}\ne 0\), it follows that, for \(t\in (\tau -\varepsilon ,\tau ]\), the function \(F_1(t)-F_2(t)\) is strictly monotone. In this case, by extending the previous analysis to the entire domain, which would maintain the strict monotonicity of the function \(F_1(t)-F_2(t)\), we would reach a contradiction: For \(t=0\), Eq. (34) yields \(F_1(0)=V(1,\text {SO},0){\mathop {=}\limits ^{(37)}}F_1(\tau )\) and \(F_2(0)=V(2,\text {SO},0){\mathop {=}\limits ^{(37)}}F_2(\tau )\), where \({\mathop {=}\limits ^{(\cdot )}}\) denotes that the equality follows from Eq. (\(\cdot \)). We thus have

$$\begin{aligned} F_1(0)-F_2(0) = F_1(\tau ) -F_2(\tau ). \end{aligned}\tag{41}$$

Due to (41), it is evident that \(F_1(\tau )-F_2(\tau ) - \frac{\mu _1 c_{\text {cm}}}{\mu _1 + \mu _2}= 0\), thus the function \(F_1(t)-F_2(t)\) satisfying Eq. (40) is a constant function, i.e.,

$$\begin{aligned} F_1(t)-F_2(t) = \frac{\mu _1c_{\text {cm}}}{\mu _1+\mu _2},\ t\in (0,\tau ]. \end{aligned}\tag{42}$$

Combining Eq. (38) with Eq. (42) leads to the optimality condition for Case (i). That is, if

$$\begin{aligned} \mu _1c_{\text {cm}}\le (\mu _1+\mu _2)\frac{c_{\text {pm}}^{\text {so}}}{p}, \end{aligned}$$

we do not perform preventive maintenance at any opportunity.
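For completeness, the step from the differential equation to Eq. (40) can be sketched with an integrating factor. The shorthand \(D(t):=F_1(t)-F_2(t)\), \(a:=\mu _1+\mu _2\), and \(K:=\mu _1 c_{\text {cm}}/(\mu _1+\mu _2)\) is introduced here only for this sketch and is not part of the original notation. Since \(aK=\mu _1 c_{\text {cm}}\), the differential equation reads \(D'(t)=-a\,(D(t)-K)\), so that

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t}\left[ e^{at}\,(D(t)-K)\right]&= e^{at}\left( a\,(D(t)-K)+D'(t)\right) = 0,\\ e^{at}\,(D(t)-K)&= e^{a\tau }\,(D(\tau )-K),\\ D(t)&= K+(D(\tau )-K)\,e^{a(\tau -t)}. \end{aligned}$$

The last line is Eq. (40). Its derivative, \(-a\,(D(\tau )-K)\,e^{a(\tau -t)}\), has a constant sign, which is why \(D\) is strictly monotone unless \(D(\tau )=K\).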

**Case (ii):** In state \((2,\text {SO},0)\), similarly to the previous case, it is optimal to not perform preventive maintenance. However, from the assumption

$$\begin{aligned} \frac{c_{\text {pm}}^{\text {so}}}{p}< F_1(\tau )-F_2(\tau ) < \frac{c_{\text {pm}}^{\text {uso}}}{p} \end{aligned}\tag{43}$$

and Eq. (37) for \(i=1\), it becomes evident that it is optimal to perform preventive maintenance on the system in state \((1,\text {SO},0)\). Similarly to Case (i), as \(F_1(\tau )-F_2(\tau ) < \frac{c_{\text {pm}}^{\text {uso}}}{p}\), there exists an \(\varepsilon >0\) for which (39) holds.

Repeating the same analysis as in Case (i), we can show that, for \(t\in [0,\tau ]\), the function \(F_1(t)-F_2(t)\) satisfies Eq. (40) and that it is a non-decreasing function if

$$\begin{aligned} F_1(\tau )-F_2(\tau ) - \frac{\mu _1c_{\text {cm}}}{\mu _1+\mu _2}<0 . \end{aligned}\tag{44}$$

However, for \(t=0\), we now have that

$$\begin{aligned} F_1(0) - F_2(0)&{\mathop {=}\limits ^{(34)}} V(1,\text {SO},0) - V(2,\text {SO},0) \\&= c_{\text {pm}}^{\text {so}} + p F_2(\tau ) + (1-p) F_1(\tau ) -F_2(\tau ) \\&=c_{\text {pm}}^{\text {so}} + (1-p)(F_1(\tau ) - F_2(\tau )). \end{aligned}\tag{45}$$

Combining (45) with (40) (on the domain \(t\in [0,\tau ]\)) yields

$$\begin{aligned} F_1(\tau )-F_2(\tau ) = \frac{\left( 1-e^{(\mu _1+\mu _2)\tau } \right) \frac{\mu _1 c_{\text {cm}}}{\mu _1 + \mu _2} - c_{\text {pm}}^{\text {so}}}{1-p- e^{(\mu _1+\mu _2)\tau } } . \end{aligned}\tag{46}$$

Combining Eqs. (43), (44), and (46) leads to the optimality condition for Case (ii). That is, if

$$\begin{aligned} (\mu _1+\mu _2)\frac{c_{\text {pm}}^{\text {so}}}{p}< \mu _1 c_{\text {cm}} < \left( \frac{c_{\text {pm}}^{\text {uso}}}{p} - \frac{c_{\text {pm}}^{\text {uso}} - c_{\text {pm}}^{\text {so}}}{e^{(\mu _1 +\mu _2)\tau }-1} \right) (\mu _1+\mu _2), \end{aligned}$$

we perform preventive maintenance on the system if it is in state 1 at an SO but not at a USO.
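The way Eqs. (40) and (45) combine into Eq. (46) can be verified numerically. The sketch below is illustrative only: it evaluates Eq. (46), plugs the result into Eq. (40) at \(t=0\), and compares that value with the right-hand side of Eq. (45). The function names and all parameter values are hypothetical.

```python
import math

def case_ii_difference_at_tau(mu1, mu2, c_cm, c_pm_so, p, tau):
    """F_1(tau) - F_2(tau) according to Eq. (46)."""
    a = mu1 + mu2
    k = mu1 * c_cm / a
    growth = math.exp(a * tau)
    return ((1.0 - growth) * k - c_pm_so) / (1.0 - p - growth)

def difference_from_eq_40(t, d_tau, mu1, mu2, c_cm, tau):
    """F_1(t) - F_2(t) according to Eq. (40), given its value d_tau at t = tau."""
    a = mu1 + mu2
    k = mu1 * c_cm / a
    return k + (d_tau - k) * math.exp(a * (tau - t))

params = dict(mu1=1.0, mu2=0.5, c_cm=3.0, tau=1.0)  # hypothetical values
c_pm_so, p = 1.0, 0.8
d_tau = case_ii_difference_at_tau(c_pm_so=c_pm_so, p=p, **params)
lhs = difference_from_eq_40(0.0, d_tau, **params)  # Eq. (40) evaluated at t = 0
rhs = c_pm_so + (1.0 - p) * d_tau                  # right-hand side of Eq. (45)
print(d_tau, lhs, rhs)
```

For these values, `lhs` and `rhs` coincide up to rounding, and `d_tau` lies above \(c_{\text {pm}}^{\text {so}}/p\) and below \(\mu _1 c_{\text {cm}}/(\mu _1+\mu _2)\), consistent with the lower bound in (43) and with (44).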

**Case (iii):** In state \((2,\text {SO},0)\), similarly to the previous case, it is optimal to not perform preventive maintenance. However, from the assumption

$$\begin{aligned} F_1(\tau )-F_2(\tau )> \frac{c_{\text {pm}}^{\text {uso}}}{p}> \frac{c_{\text {pm}}^{\text {so}}}{p} \end{aligned}\tag{47}$$

and Eq. (37) for \(i=1\), it becomes evident that it is optimal to perform preventive maintenance on the system in state \((1,\text {SO},0)\). Along the lines of the previous cases, as \(F_1(\tau )-F_2(\tau ) > \frac{c_{\text {pm}}^{\text {uso}}}{p}\), there exists an \(\varepsilon >0\) for which

$$\begin{aligned} F_1(t)-F_2(t) \ge \frac{c_{\text {pm}}^{\text {uso}}}{p}, \text { for all }t\in (\tau -\varepsilon ,\tau ]. \end{aligned}\tag{48}$$

In this case, for \(t\in (\tau -\varepsilon ,\tau ]\), we have that \(V(1,\text {USO},t)=c_{\text {pm}}^{\text {uso}} + pF_2(t) + (1-p) F_1(t)\) and \(V(2,\text {USO},t) = F_2(t)\) (cf. Eq. (36)). Taking the derivative with respect to *t* in (34) and substituting the above-obtained values for \(V(1,\text {USO},t)\) and \(V(2,\text {USO},t)\) yields

$$\begin{aligned} F_1'(t)-F_2'(t)&=-(\mu _1+\lambda )F_1(t)+\mu _1 V(1,\text {SC},t)+\lambda V(1,\text {USO},t)\\&\quad +\, (\mu _2+\lambda )F_2(t)-\mu _2 V(2,\text {SC},t)-\lambda V(2,\text {USO},t) \\&= -(\mu _1+\lambda )F_1(t) + \mu _1 (c_{\text {cm}} + F_2(t)) \\&\quad +\, \lambda (c_{\text {pm}}^{\text {uso}} + pF_2(t) + (1-p) F_1(t))\\&\quad +\, (\mu _2+\lambda )F_2(t) - \mu _2 F_1(t) - \lambda F_2(t) \\&= -(\mu _1+\mu _2 + \lambda p) (F_1(t)-F_2(t)) + \mu _1 c_{\text {cm}} + \lambda c_{\text {pm}}^{\text {uso}},\ t \in (\tau -\varepsilon ,\tau ]. \end{aligned}\tag{49}$$

The solution to the above differential equation reads

$$\begin{aligned} F_1(t)-F_2(t)&= \frac{\mu _1c_{\text {cm}} + \lambda c_{\text {pm}}^{\text {uso} }}{\mu _1+\mu _2 + \lambda p } \\&\quad +\, \left( F_1(\tau )-F_2(\tau ) -\frac{\mu _1c_{\text {cm}} + \lambda c_{\text {pm}}^{\text {uso} }}{\mu _1+\mu _2 + \lambda p } \right) e^{(\mu _1+\mu _2 + \lambda p )(\tau -t)},\ t \in (\tau -\varepsilon ,\tau ]. \end{aligned}\tag{50}$$

Note that, if we assume that \(F_1(\tau )-F_2(\tau ) -\frac{\mu _1c_{\text {cm}} + \lambda c_{\text {pm}}^{\text {uso} }}{\mu _1+\mu _2 + \lambda p }\ge 0\), then we can extend (50) on the entire domain \(t\in [0,\tau ]\), and the function \(F_1(t)-F_2(t)\) is non-increasing. However, this is infeasible. Note that, for \(t=0\), Eq. (34) yields \(F_1(0)=V(1,\text {SO},0){\mathop {=}\limits ^{(37)}}c_{\text {pm}}^{\text {so} }+pF_2(\tau )+qF_1(\tau )\) and \(F_2(0)=V(2,\text {SO},0){\mathop {=}\limits ^{(37)}}F_2(\tau )\), thus

$$\begin{aligned} F_1(0)-F_2(0)&= c_{\text {pm}}^{\text {so} }+q(F_1(\tau ) -F_2(\tau ))\ge F_1(\tau ) -F_2(\tau )\\&\Leftrightarrow \ F_1(\tau )-F_2(\tau ) \le \frac{c_{\text {pm}}^{\text {so}}}{p}, \end{aligned}\tag{51}$$

which contradicts Assumption (47). Due to this contradiction, it is necessary to assume that \(F_1(\tau )-F_2(\tau ) -\frac{\mu _1c_{\text {cm}} + \lambda c_{\text {pm}}^{\text {uso} }}{\mu _1+\mu _2 + \lambda p }< 0\). This implies that the function \(F_1(t)-F_2(t)\) is non-decreasing and we can extend (50) on the domain \(t\in [t^*,\tau ]\), where \(t^*\) is such that \(F_1(t^*)-F_2(t^*)= \frac{c_{\text {pm}}^{\text {uso}}}{p}\), i.e.,

$$\begin{aligned} F_1(t)-F_2(t)&= \frac{\mu _1c_{\text {cm}} + \lambda c_{\text {pm}}^{\text {uso} }}{\mu _1+\mu _2 + \lambda p } \\&\quad +\, \left( F_1(\tau )-F_2(\tau ) -\frac{\mu _1c_{\text {cm}} + \lambda c_{\text {pm}}^{\text {uso} }}{\mu _1+\mu _2 + \lambda p } \right) e^{(\mu _1+\mu _2 + \lambda p )(\tau -t)},\ t \in [t^*,\tau ]. \end{aligned}\tag{52}$$

See Fig. 8 for a visualization of \(F_1(t)-F_2(t) \).

From the definition of \(t^*\), and the continuity of \(F_1(t)-F_2(t) \), it follows that there exists an \(\varepsilon >0\) such that

$$\begin{aligned} F_1(t)-F_2(t) \le \frac{c_{\text {pm}}^{\text {uso}}}{p}, \text { for all }t\in (t^*-\varepsilon ,t^*]. \end{aligned}\tag{53}$$

Note that if one were to assume that \(F_1(t)-F_2(t) \ge \frac{c_{\text {pm}}^{\text {uso}}}{p}\) for all \(t\in (t^*-\varepsilon ,t^*]\), then due to Eq. (51), this would again contradict Assumption (47).

Now repeating the analysis performed in Case (i), albeit in a different domain, we can show that, for \(t \in [0,t^*]\),

$$\begin{aligned} F_1(t)-F_2(t)&= \frac{\mu _1c_{\text {cm}}}{\mu _1+\mu _2} + \left( F_1(t^*)-F_2(t^*) - \frac{\mu _1 c_{\text {cm}}}{\mu _1 + \mu _2} \right) e^{(\mu _1+\mu _2)(t^*-t)},\quad t\in [0,t^*]. \end{aligned}\tag{54}$$

From the continuity of \(F_1(t)-F_2(t)\) at \(t=t^*\), we obtain

$$\begin{aligned} \frac{c_{\text {pm}}^{\text {uso}}}{p}&= \frac{\mu _1c_{\text {cm}} + \lambda c_{\text {pm}}^{\text {uso} }}{\mu _1+\mu _2 + \lambda p } + \left( F_1(\tau )-F_2(\tau ) -\frac{\mu _1c_{\text {cm}} + \lambda c_{\text {pm}}^{\text {uso} }}{\mu _1+\mu _2 + \lambda p } \right) e^{(\mu _1+\mu _2 + \lambda p )(\tau -t^*)}. \end{aligned}\tag{55}$$

Furthermore, setting \(t=0\) in Eq. (54) and using (51) yields

$$\begin{aligned} c_{\text {pm}}^{\text {so}} + (1-p)(F_1(\tau ) - F_2(\tau ))&= \frac{\mu _1c_{\text {cm}}}{\mu _1+\mu _2} + \left( \frac{c_{\text {pm}}^{\text {uso}}}{p} - \frac{\mu _1 c_{\text {cm}}}{\mu _1 + \mu _2} \right) e^{(\mu _1+\mu _2)t^*}. \end{aligned}\tag{56}$$

Note that Eqs. (55) and (56) form a system of two equations with two unknowns, which produce a unique solution for \(t^*\), cf. Eq. (1). Since \(F_1(t) - F_2(t)\) is a continuous function throughout \([0,\tau )\), we can directly use the optimality condition for Case (ii) to state the optimality condition for this case. That is, if

$$\begin{aligned} \mu _1 c_{\text {cm}} > \left( \frac{c_{\text {pm}}^{\text {uso}}}{p} - \frac{c_{\text {pm}}^{\text {uso}} - c_{\text {pm}}^{\text {so}}}{e^{(\mu _1 +\mu _2)\tau }-1} \right) (\mu _1+\mu _2), \end{aligned}$$

we perform preventive maintenance on the system if it is in state 1 at an SO and at a USO for which the residual time until the next SO is in the interval \([{\hat{t}},\tau )\), with \({\hat{t}} = \min \{\tau ,\max \{0,t^*\}\}\).
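The system formed by Eqs. (55) and (56) can also be solved numerically. The sketch below does not reproduce the closed form referenced as Eq. (1); instead, as an assumption made purely for illustration, it eliminates \(F_1(\tau )-F_2(\tau )\) via Eq. (56) and applies bisection to the residual of Eq. (55) on \([0,\tau ]\). The function name, the tolerance, and the parameter values are hypothetical, and the routine presumes \(p<1\).

```python
import math

def solve_t_star(mu1, mu2, lam, p, c_cm, c_pm_so, c_pm_uso, tau, tol=1e-12):
    """Bisection for t* in Eqs. (55)-(56); also returns F_1(tau) - F_2(tau)."""
    a = mu1 + mu2
    b = a + lam * p
    k = mu1 * c_cm / a                     # constant term in Eq. (54)
    m = (mu1 * c_cm + lam * c_pm_uso) / b  # constant term in Eq. (52)
    u = c_pm_uso / p                       # threshold at a USO

    def d_tau(t_star):
        # Eq. (56) solved for F_1(tau) - F_2(tau); requires p < 1
        return (k + (u - k) * math.exp(a * t_star) - c_pm_so) / (1.0 - p)

    def residual(t_star):
        # Eq. (55): right-hand side minus left-hand side
        return m + (d_tau(t_star) - m) * math.exp(b * (tau - t_star)) - u

    lo, hi = 0.0, tau
    if residual(lo) * residual(hi) > 0:
        raise ValueError("no sign change on [0, tau]; t* may lie outside this range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(lo) * residual(mid) <= 0:
            hi = mid
        else:
            lo = mid
    t_star = 0.5 * (lo + hi)
    return t_star, d_tau(t_star)

# Hypothetical parameter values satisfying the Case (iii) condition
t_star, d = solve_t_star(mu1=1.0, mu2=0.5, lam=2.0, p=0.8,
                         c_cm=10.0, c_pm_so=1.0, c_pm_uso=2.0, tau=1.0)
print(t_star, d)
```

If the residual does not change sign on \([0,\tau ]\), the routine raises an error rather than guessing; in that situation the truncation \({\hat{t}} = \min \{\tau ,\max \{0,t^*\}\}\) is the relevant quantity.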

**Case (iv):** This case follows by repeating the steps of Case (iii) for \(t^*=\tau \). \(\square \)

### Proof of Theorem 2

### Proof

Similarly to the proof of Theorem 1, we need to make certain assumptions here regarding the actions at the given opportunities. In particular, we distinguish four cases, each corresponding to a different set of actions: Case (i): \(F_1(\tau )-F_2(\tau )\le \frac{c_{\text {pm}}^{\text {uso}}}{p}\); Case (ii): \(\frac{c_{\text {pm}}^{\text {uso}}}{p}< F_1(\tau )-F_2(\tau )< \frac{c_{\text {pm}}^{\text {so}}}{p}\); Case (iii): \(\frac{c_{\text {pm}}^{\text {so}}}{p} < F_1(\tau )-F_2(\tau )\); Case (iv): \(F_1(\tau )-F_2(\tau )= \frac{c_{\text {pm}}^{\text {so}}}{p}\). The proof of this theorem is similar in structure to the proof of Theorem 1, and for this reason it is omitted. \(\square \)

### Proof of Theorem 4

### Proof

We first focus on the derivation of the cycle length appearing in the denominator of Eq. (8). Observe that the length of a renewal cycle consists of the time the system spends in state 2 plus the time from the state change \(2 \rightarrow 1\) until the first successful maintenance. To this end, let \(C\!L\) denote the length of the part of the renewal cycle that the underlying stochastic process spends in state 1. Furthermore, let *Y* denote the random amount of time from a state change \(2\rightarrow 1\) to the first SO. We then have, for the probability density function of *Y*, that

which leads to Eq. (13). Conditioning on *Y*, a renewal cycle can either end before the first SO, or at the first SO, or after the first SO. Hence, we have that the expected cycle length is equal to

We first focus on deriving expressions for the individual expectations in Eq. (57). Note that the first successful maintenance can be of type \(j \in \{\text {SC},\text {SO},\text {USO}\}\) and may occur in the interval \([t, t']\); in short, this is denoted by \(j\,[t, t']\). Thus, rewriting the first part in Eq. (57) results in (cf. Eq. (9))

For the second expectation in Eq. (57), observe that the length of this part can be further decomposed: First the system goes through a geometric number of intervals of length \(\tau \) in which no successful maintenance activity takes place, after which the system enters the last interval in which the successful maintenance activity takes place. To this end, let \(p_u\) be the probability that there is no successful maintenance activity in an arbitrary interval between two SOs (including the SO with which this interval ends) after the state change \(2\rightarrow 1\), i.e.,

We then have, from the memoryless property of \(T_{\mu _1}\) and \(T_{\lambda p}\),

where \({\mathbb {E}}\left[ C\!L'\, \mathbb {1}_{\{CL'\le Y\}} \,|\, Y=\tau \right] \) is the expected length of the last part of the renewal cycle, i.e., the interval in which the successful maintenance activity takes place. Analogously to Eq. (58), we can further decompose \({\mathbb {E}}\left[ C\!L'\, \mathbb {1}_{\{C\!L'\le Y\}} \,|\, Y=\tau \right] \) by conditioning on the type of the successful maintenance activity with which it ends.

We are now left with defining the events that lead to \(j\,[t, t']\), such that we can calculate the expectations in Eqs. (17)–(19). With respect to \(\text {SO}[\tau -y, \tau ]\), observe that if \(y\in [0,{\tilde{t}})\), \(\mathbb {1}_{\{ \text {SO}{[\tau -y,\tau ]}\}}\) is equal to 1 if \(T_{\mu _1}>y\), since we do not take any USOs. If \(y\in [{\tilde{t}},\tau ]\), no successful USOs in \([\tau -y,\tau -{\tilde{t}}]\) can occur and \(T_{\mu _1}>y\) for \(\mathbb {1}_{\{ \text {SO}{[\tau -y,\tau ]}\}}\) to be equal to 1. Combining this leads to Eq. (14). Equations (15) and (16) are obtained along similar lines. Note that all expectations and probabilities only involve exponentially distributed random variables. Consequently, closed-form expressions can be obtained using straightforward calculus. However, for the sake of brevity, we have chosen to provide one closed-form expression and omit the rest (which can be obtained analogously). For Eq. (17), we have for \(y>{\tilde{t}}\):

We now focus on the numerator of Eq. (8), i.e., the expected cycle cost. To that end, let \(C\!C\) be the cost incurred in a renewal cycle. The analysis for the expected cycle cost, \({\mathbb {E}}\left[ C\!C\right] \), is similar to the analysis of the expected cycle length. Again, we decompose the length of a renewal cycle into three parts (i.e., the interval after the state change until the first SO, the geometric number of intervals of length \(\tau \) in which no successful maintenance activity takes place, and the last interval in which the successful maintenance activity takes place), and compute the conditional expected cycle costs in these parts (mainly consisting of costs incurred at unsuccessful maintenance activities). Thus,

We first focus on the first part in Eq. (60) and condition further on the type of activity, which yields

Analogously to the expected cycle length, the expected cost incurred during the geometric number of intervals of length \(\tau \), in which no successful maintenance activity takes place, is equal to

Observe that the expected cost in the interval in which the successful maintenance activity takes place is composed of two parts regardless of the type of activity, i.e., the cost of the successful maintenance activity itself and the cost related to the unsuccessful USOs up to the successful maintenance activity (see Eqs. (20)–(22)). Again, all expectations and probabilities related to the costs only involve exponentially distributed random variables, and again, for the sake of brevity, we have chosen to provide one closed-form expression and omit the rest (which can be obtained analogously). For Eq. (21), we have,

with

which completes the proof. \(\square \)

## Rights and permissions

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## About this article

### Cite this article

Drent, C., Kapodistria, S. & Resing, J.A.C. Condition-based maintenance policies under imperfect maintenance at scheduled and unscheduled opportunities.
*Queueing Syst* **93**, 269–308 (2019). https://doi.org/10.1007/s11134-019-09627-w

DOI: https://doi.org/10.1007/s11134-019-09627-w

### Keywords

- Markov decision processes
- Condition-based maintenance
- Opportunistic maintenance

### Mathematics Subject Classification

- 90B25
- 90C40
- 60K15
- 60J20