Condition based maintenance policies under imperfect maintenance at scheduled and unscheduled opportunities

Motivated by the cost savings that can be obtained by sharing resources in a network context, we consider a stylized, yet representative, model for the coordination of maintenance and service logistics for a geographic network of assets. Capital assets, such as wind turbines in a wind park, require maintenance throughout their long lifetimes. Two types of preventive maintenance are considered: planned maintenance at periodic, scheduled opportunities, and opportunistic maintenance at unscheduled opportunities. The latter type of maintenance arises due to the network context: when an asset in the network fails, this constitutes an opportunity for preventive maintenance for the other assets in the network. To increase the realism of the model and its applicability to various sectors, we consider both the option of deferring and the option of not deferring planned maintenance after the occurrence of opportunistic maintenance. We also assume that preventive maintenance may not always restore the condition of the system to 'as good as new'. By formulating this problem as a semi-Markov decision process, we characterize the optimal policy as a control limit policy (depending on the remaining time until the next planned maintenance) that indicates, on the one hand, when it is optimal to perform preventive maintenance and, on the other hand, when maintenance resources should be shared if an opportunity in the network arises. To facilitate managerial insights into the effect of each parameter on the cost, we provide a closed-form expression for the long-run rate of cost for any given control limit policy (depending on the remaining time until the next planned maintenance) and compare the costs under the optimal policy to those of sub-optimal policies that neglect the opportunity for resource sharing. We illustrate our findings using data from the wind energy industry.


Introduction
High valued capital assets, such as energy systems (e.g., wind turbines), medical systems (e.g., interventional X-ray machines), lithography machines in semiconductor fabrication plants, and baggage handling systems at its corrective maintenance instance can be viewed as an unscheduled opportunity for preventive maintenance for the other assets in the network. In these instances, opportunistic maintenance can take place, with the respective instances constituting the unscheduled opportunities of preventive maintenance. This form of network dependency can be viewed on two levels: (i) the economic dependency between the various systems of a network, and (ii) the structural degradation and failure dependencies. Similarly to planned maintenance, opportunistic maintenance has a lower cost in comparison to that of corrective maintenance.
Incorporating opportunistic maintenance may also affect the scheduling of planned maintenance, as it might be beneficial to defer the next planned maintenance opportunity so that it takes place a period of length τ after the occurrence of an opportunistic maintenance. The decision whether or not to defer planned maintenance after the occurrence of opportunistic maintenance may have a positive or a negative effect on the total costs.
In maintenance, it is oftentimes assumed that a maintenance activity is perfect, i.e. it restores the system to a state of 'as good as new'. However, this assumption may not be true in practice. For instance, a misidentification of the root cause of the (imminent) failure can lead to an erroneous repair that does not resolve the actual issue, or some minor repair activity (such as exchange of parts, changes or adjustment of the settings, software update, lubrication or cleaning; see Spinato et al. (2009)) may not restore the system to a state of 'as good as new'. In the above-mentioned cases, it is more reasonable to assume that the system is restored to a state between 'as bad as old' and 'as good as new'. This concept will be referred to as imperfect maintenance. Evidently, this assumption impacts the resulting cost. Hence, knowledge of how successful a maintenance activity is likely to be should not be ignored in the maintenance planning.
In conclusion, asset owners are oftentimes faced with the following questions: (i) What is the advantage of incorporating planned maintenance in comparison to exercising only corrective maintenance?
(ii) What is the benefit of sharing resources in the network (in the form of incorporating opportunistic maintenance in addition to the planned maintenance)?
(iii) What is the influence of deferring the planned maintenance after the occurrence of opportunistic maintenance?
(iv) What is the influence of imperfect maintenance on the maintenance planning and on the costs (long-run rate of cost)?
(v) When should preventive maintenance be performed (so as to minimize the long-run rate of cost)?

Main contributions
We consider a stylized, yet representative model that incorporates the above-mentioned characteristics and we derive the structure of the optimal maintenance policies. Furthermore, we compute an explicit expression for the long-run rate of cost, which can be easily used by asset owners and service providers so as to gain further insights into their practice and so as to compute the cost-benefits of changing their maintenance practice.
More concretely, the main contributions of the paper are threefold: (1) We consider a semi-Markov decision process that incorporates planned and opportunistic maintenance, as well as imperfect maintenance. From the analysis of the semi-Markov decision process stems the characterization of the optimal policy as a control limit policy (threshold) depending on the time until the next planned maintenance opportunity. Moreover, using this approach, we are able to derive a closed-form expression for this control limit.
(2) Considering the class of control limit policies (depending on the remaining time until the next planned maintenance), we derive, using the theory of regenerative processes, an explicit expression for the long-run rate of cost.
(3) We consider data from the wind energy industry and provide, based on these values, concrete answers to Questions (i)-(v) mentioned above. More specifically, we analyze the benefit of using planned and opportunistic maintenance compared to only corrective maintenance. We also analyze the influence of deferring planned maintenance after the occurrence of opportunistic maintenance. Finally, we also highlight the cost savings that can be attained by reducing the probability of an imperfect maintenance.

Outline of this paper
The remainder of this paper is structured as follows: In Section 2, we review the related literature. In Section 3, we describe in detail the model at hand, which captures the condition of the asset and which incorporates imperfect maintenance at scheduled and unscheduled maintenance opportunities. Subsequently, in Section 4, we characterize the structure of the optimal policy for condition based maintenance using the average cost criterion, see Section 4.1, and we compute the long-run rate of cost for any policy with the same structure as the optimal policy (i.e. the class of control limit policies depending on the remaining time until the next planned maintenance), see Section 4.2. In Section 5, we permit the deferral of planned maintenance after the occurrence of opportunistic maintenance, and we compute the long-run rate of cost. A numerical illustration is provided in Section 6, where, based on data from the wind energy industry, we compare the long-run rate of cost for various policies, we show the effect of imperfect maintenance, and the effect of deferring planned maintenance. Finally, Section 7 contains concluding remarks and highlights directions for future research.

Literature review
Maintenance optimization models have been extensively studied in the literature. Optimal maintenance policies aim to provide optimal system reliability/availability and safety performance at the lowest possible maintenance costs (Pham and Wang, 1996). Due to the fast development of sensing techniques in recent years, the state of a capital asset can be monitored or inspected at a much lower cost and in a continuous fashion, which facilitates condition based maintenance. Condition based maintenance recommends maintenance actions based on information collected through online monitoring of the capital asset and it can significantly reduce maintenance costs by decreasing the number of unnecessary maintenance operations, see e.g., Jardine et al. (2006); Peng et al. (2010); Lam and Banjevic (2015). The condition based maintenance model that we propose builds on the delay time model proposed by Christer (1982) and Christer and Waller (1984). We refer the reader to Baker and Christer (1994), Christer (1999), Wang (2008) and, more recently, Wang (2012) for an overview of delay time models. Delay time models are not only well known in the literature, but are also frequently applied in practice.
Practice-based research with real diagnostic data, such as data related to the spectrometry of oil (e.g., Makis et al., 2006; Kim et al., 2011) and data related to vibrations (e.g., Yang and Makis, 2010), showed that it is usually sufficient, and even preferable from a modeling and decision making perspective, to consider only two operational states. The first state is the perfect state, in which the system resides from the moment it is newly installed until the point at which a hidden defect has been identified. After the occurrence of a hidden defect in the system until the occurrence of a failure (a period typically referred to as the delay time), the system resides in the second state, also referred to as the satisfactory state. Such a classification of the operational states has the property that maintenance actions are initiated only when the system has degraded to the state that can actually lead to a direct failure, i.e. the satisfactory state, but not when the system is functioning perfectly, i.e. the perfect state. The vast majority of the literature on delay time models is restricted to numerical methods or approximations to solve the models at hand, due to their underlying complexity. A few recent exceptions are Maillart and Pollock (2002), Kim and Makis (2013) and Van Oosterom et al. (2014), who study two-state systems under periodic inspection, partial observability, and postponed replacement, respectively, and provide analytical results regarding the structure of the optimal policy. However, none of them considers the option of resource sharing in the network (in the form of opportunistic maintenance), nor do they incorporate the notion of imperfect repair.
Most delay time model analyses assume that the system after a maintenance action is restored to a state of 'as good as new'. Contrary to this assumption, in imperfect maintenance it is assumed that upon preventive maintenance, the system lies in a state somewhere between 'as good as new' and 'as bad as old'. This notion was first introduced by Nakagawa (1979a,b) and is called the (p, q)-rule. Under the (p, q)-rule, after preventive maintenance the system is returned to an 'as good as new' state (perfect preventive maintenance) with probability p and to the 'as bad as old' state (minimal preventive maintenance) with probability q = 1 − p.
Clearly, the case p = 0 corresponds to having no preventive maintenance. Also, from a practical point of view, imperfect maintenance can describe a large set of realistic maintenance actions (Pham and Wang, 1996).
When planning condition based maintenance strategies, see, e.g., Jardine et al. (2006); Jardine and Tsang (2005); Prajapati et al. (2012), a typical assumption in the literature is that the system at hand is monitored continuously and one can intervene and maintain the system at any given moment. However, due to accessibility reasons (e.g., in the case of off-shore wind parks) or for cost reduction purposes, it is cost optimal and more practical to allow only for discrete time opportunities. The simplest amongst the discrete time opportunities are the periodic planned maintenance instances (also referred to as scheduled downs), with period τ, say, that serve as a scheduled opportunity to do maintenance for a network of systems. Furthermore, unplanned maintenance instances (due to opportunistic maintenance) can be modeled as discrete instances occurring according to a multi-dimensional counting process.
For recent works related to opportunistic maintenance, the interested reader is referred to Zhu et al. (2016, 2017), Arts and Basten (2018), and Kalosi et al. (2016). In Zhu et al. (2016) and Zhu et al. (2017), the authors consider a single-unit system and account for both scheduled and unscheduled opportunities.
In these analyses, the authors model the age and the condition, respectively, of the system and derive, based on approximations, the long-run rate of cost under a given policy. In both papers, the arrivals of unscheduled opportunities are modeled according to a homogeneous Poisson process. This approximation is justified by the Palm-Khintchine theorem (Khinchin, 1956), which states that even if the failure times of some systems do not follow exponential distributions, the superposition of a sufficiently large number of independent renewal processes behaves asymptotically like a Poisson process. Arts and Basten (2018) build further on Zhu et al. (2016, 2017), but they only consider scheduled maintenance opportunities (excluding unscheduled opportunities). Furthermore, Arts and Basten (2018) assume that at a scheduled opportunity, the system is restored to a perfect condition (i.e. p = 1), while at a failure they assume that the system is restored to a state which is stochastically identical to the state just prior to the system's failure. In a recent conference paper, Kalosi et al. (2016) looked at a model with both planned and unplanned maintenance opportunities, at which the system is restored to a perfect condition, showing some preliminary results that a control limit policy (depending on the remaining time until the next planned maintenance) is optimal.
In contrast to Arts and Basten (2018) and to Zhu et al. (2016, 2017), in which the long-run rate of cost is computed for a given policy, we first characterize the structure of the optimal policy explicitly and thereafter, for the optimal policy class, we compute the long-run rate of cost. Furthermore, we include both scheduled and unscheduled maintenance opportunities. In contrast to Kalosi et al. (2016), we extend the model by incorporating the (p, q)-rule, making it more generic and realistic. Moreover, we are the first to analyze the influence of deferring planned maintenance and we illustrate the financial effects of the maintenance policy in a realistic context using data stemming from the wind industry.

Model description
We consider a single unit system (equivalently, a component or asset) that is monitored continuously and whose condition is fully observable. We assume that the condition of the system degrades over time and that it can be modeled according to a delay time model. That is, the states are classified as perfect, satisfactory and failed. We shall refer to the state of perfect condition as state 2, the state of satisfactory condition as state 1 and the failure state as state 0. Furthermore, we assume that as soon as a system failure occurs, the system is instantaneously replaced by an 'as good as new' system. So, in the mathematical formulation of the model, we may assume, due to the instantaneous replacement at failure, that the model evolves between only states 1 and 2. The system spends an exponential amount of time with rate $\mu_i$ in state $i$, $i \in \{1, 2\}$.
The above model formulation implies that initially the system starts in state 2 (perfect state); then, after an exponential amount of time with rate $\mu_2$, the system deteriorates and the condition of the system goes to state 1 (satisfactory state). The system spends an exponential amount of time with rate $\mu_1$ in state 1, after which a failure occurs. At a failure the system is instantaneously replaced by an 'as good as new' system and the condition is restored to state 2 (perfect state). A schematic evolution of the condition of the component and the corresponding times of transitions are depicted in Figure 1. We assume that we have two types of opportunities in which we can perform preventive maintenance (PM) before failure: the scheduled and the unscheduled opportunities. The scheduled opportunities correspond to pre-arranged opportunities occurring according to a fixed schedule. These opportunities can be attributed to either service/maintenance agreements or to regulation imposition checks. We assume that the scheduled opportunities occur at epochs $\tau, 2\tau, 3\tau, \ldots$, with $\tau > 0$. This is also in accordance with what happens in practice, as maintenance actions once planned are typically not rescheduled. The unscheduled opportunities correspond to random opportunities triggered by failures of other systems in close proximity. We assume that these unscheduled opportunities occur according to a Poisson process at rate $\lambda$.
The unscheduled and scheduled opportunities, abbreviated by USO and SO, respectively, serve as opportunities to perform preventive maintenance. Such a preventive maintenance is assumed to cost less than a corrective maintenance (CM) upon failure, which costs $c_{cm}$. Moreover, incorporating a planning perspective, we may assume that the preventive maintenance cost at an SO, $c^{so}_{pm}$, is less than or equal to the corresponding cost at a USO, say $c^{uso}_{pm}$, that is, $0 < c^{so}_{pm} \le c^{uso}_{pm} < c_{cm}$ (however, we also extend our analysis to the case $c^{so}_{pm} > c^{uso}_{pm}$). Following the (p, q)-rule of Nakagawa (1979a,b), we assume that after preventive maintenance a system is returned to the 'as good as new' state with probability $p \in (0, 1]$ and returned to the 'as bad as old' state (i.e. the amount of time left until the failure has not altered) with probability $q = 1 - p$.
Our aim is to determine a policy when to perform preventive maintenance on the system based on its condition and the opportunity type, i.e. scheduled or unscheduled. More explicitly, we will need to formally define the state space, which refers to the condition of the system, the action space and the decision epochs. The state space is governed by the process depicting the condition of the system, i.e. the Markov chain evolving between the states {1, 2}. The action space consists of only two actions: perform preventive maintenance or do nothing. Lastly, the decision epochs are the SO and USO epochs. In Figure 2, we depict the SO epochs by ( * ) and the USO epochs by (o).

Optimal policy
The goal of this section is twofold: We first characterize the structure of the optimal average cost condition based maintenance policy. We then derive an explicit form for the long-run rate of cost per time unit for any given policy that has the same structure as the optimal policy.

Average cost criterion
This section is devoted to the derivation of the optimal policy on when to perform preventive maintenance for the system at hand using the average cost criterion. To this purpose, we set up our problem as a (controlled) semi-Markov decision process. Due to the stochastic nature of the problem, it does not suffice to know the type of the decision epoch (SO or USO), but it is also required to keep track of the remaining time till the next SO. That time may impact our decision, i.e. the optimal policy may depend on the residual time till the next SO. Thus, for the full description of the condition (state) of the system, we use a triplet descriptor and for this reason we use techniques that stem from semi-Markov decision processes. At each decision epoch (depending on the values of (i, j, t) ∈ S), we can choose to perform preventive maintenance or do nothing or in case of a failure to do corrective maintenance (CM), that is A = {perform PM, do nothing, perform CM}, where A represents the overall action space. The optimal condition based maintenance policy is described by the following theorem.
Theorem 1. Under the assumption that $c^{so}_{pm} < c^{uso}_{pm}$ and given the imperfect preventive maintenance probability $1 - p \in [0, 1)$, the optimal policy under the average cost criterion is: For state 2 to do nothing. For state 1 to perform preventive maintenance at scheduled opportunities if $\mu_1 c_{cm} > (\mu_1 + \mu_2)\, c^{so}_{pm} / p$, and to do nothing otherwise, and to perform preventive maintenance at unscheduled opportunities for which the residual time until the next scheduled opportunity is greater than the control limit $t^*$ given by Equation (1).
Proof. See Appendix A.1.
For USOs, Theorem 1 establishes a control limit policy depending on the remaining time until the next SO: if the residual time until the next SO is smaller than $t^*$, then it is optimal not to take the opportunity to perform preventive maintenance in state 1. This is intuitive in the sense that the urgency for preventive maintenance in state 1 at a USO should decrease as the cheaper opportunity at an SO is approaching.
Note that in the special case when preventive maintenance costs at SOs and USOs are equal, the optimal policy reduces to a stationary control limit policy, which is shown in Proposition 1.
Proposition 1. Under the assumption that $c^{so}_{pm} = c^{uso}_{pm} = c_{pm} > 0$ and given the imperfect preventive maintenance probability $1 - p \in [0, 1)$, the optimal policy under the average cost criterion is: For state 2 to do nothing. For state 1 to perform preventive maintenance at both SOs and USOs if $\mu_1 c_{cm} > (\mu_1 + \mu_2)\, c_{pm} / p$, and to do nothing otherwise.
Proof. See Appendix B.
One could also argue that the cost for preventive maintenance at a USO is actually less than the cost at an SO since there is already a cost attached to the opportunity that you are taking (e.g., service engineers are already at a wind park and they can at a small extra cost repair other systems in close proximity as well).
In this case, the optimal control policy also reduces to a stationary control limit policy, which is described in Theorem 2.
Theorem 2. Under the assumption that $c^{so}_{pm} > c^{uso}_{pm}$ and given the imperfect preventive maintenance probability $1 - p \in [0, 1)$, the optimal policy under the average cost criterion is: For state 2 to do nothing. For state 1 to perform preventive maintenance at an unscheduled opportunity if $\mu_1 c_{cm} > (\mu_1 + \mu_2)\, c^{uso}_{pm} / p$, and to do nothing otherwise, and to perform preventive maintenance at an SO if $\mu_1 c_{cm} > (\mu_1 + \mu_2)\, c^{so}_{pm} / p + \lambda (c^{so}_{pm} - c^{uso}_{pm})$, and to do nothing otherwise.
Proof. See Appendix C.
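The state-1 conditions of Theorem 1, Proposition 1 and Theorem 2 can be evaluated mechanically. The following is a minimal sketch, assuming the inequalities have the form $\mu_1 c_{cm} > (\mu_1 + \mu_2)\, c / p$ (plus the term $\lambda (c^{so}_{pm} - c^{uso}_{pm})$ at SOs in Theorem 2). The function name and the dictionary layout are hypothetical and only serve the illustration; since the expression for the control limit in Equation (1) is not restated here, the USO decision under Theorem 1 is returned as `None` (it depends on the residual time).

```python
def state1_pm_rules(mu1, mu2, lam, p, c_so, c_uso, c_cm):
    """Stationary PM decisions in state 1 (hypothetical helper, see lead-in).

    Returns {"so": bool, "uso": bool or None}; None means the decision
    depends on the residual time until the next SO (Theorem 1).
    """
    benefit = mu1 * c_cm  # left-hand side shared by all conditions
    if c_so < c_uso:
        # Theorem 1: stationary rule at SOs, time-dependent rule at USOs.
        so = benefit > (mu1 + mu2) * c_so / p
        uso = None
    elif c_so == c_uso:
        # Proposition 1: one stationary rule for both opportunity types.
        so = uso = benefit > (mu1 + mu2) * c_so / p
    else:
        # Theorem 2: USOs are cheaper, SOs carry the extra lambda term.
        uso = benefit > (mu1 + mu2) * c_uso / p
        so = benefit > (mu1 + mu2) * c_so / p + lam * (c_so - c_uso)
    return {"so": so, "uso": uso}


# Gearbox-like values used later in the numerical section.
print(state1_pm_rules(0.31, 0.31, 1.0, 0.6, 1000, 2000, 300000))
```

With a corrective cost that is orders of magnitude larger than the preventive costs, the stationary conditions hold comfortably; they only fail when preventive maintenance is nearly as expensive as a failure or when p is very small.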

Long-run rate of cost per time unit
In the previous section, we characterized the structure of the optimal policy using the average cost criterion.
This policy can be viewed as a control limit policy, with the control limit depending on the time until the next SO. In this section, we consider such a policy and we compute the long-run rate of cost per time unit.
More concretely, we consider a policy under which in state 2 we do not perform preventive maintenance (i.e. we do nothing), and in state 1 we always perform preventive maintenance at SOs and we perform preventive maintenance at USOs if the remaining time till the next SO is greater than $\tilde{t}$, for some given value $\tilde{t} \in (0, \tau)$.
The results obtained in this section are directly applicable to the results of Section 4.1, by directly setting $\tilde{t} = t^*$, cf. Theorem 1.
For the computation of the long-run rate of cost per time unit, we employ the theory of regenerative-like processes, also called stationary-cycle processes, described in Section 2.19 of Serfozo (2009). To this purpose, we consider the inter-regeneration times created by the SOs $\{\tau, 2\tau, 3\tau, \ldots\}$. For the cost computation, we assume that, at the SOs, the system is in state 1 or 2 according to the stationary probabilities $p_1(0)$ and $p_2(0)$, respectively. The long-run rate of cost per time unit is calculated as the expected total cost incurred between consecutive SOs divided by $\tau$.
Let $p_i(t)$ be the probability that the system is in state $i \in \{1, 2\}$ given that the time until the next SO is $t \in [0, \tau)$. The long-run rate of cost per time unit for this control limit policy (depending on the remaining time until the next planned maintenance), for any given time threshold, is given in the next theorem.
Theorem 3. Consider a given policy under which in state 2 we opt for the action do nothing, and in state 1 we repair at scheduled opportunities and at unscheduled opportunities for which the remaining time until the next scheduled opportunity is greater than $\tilde{t} \in (0, \tau)$, and we do nothing otherwise. Under this policy, the long-run rate of cost per time unit is given by Equation (2), with the probabilities $p_i(t)$ given by Equations (3) and (4) and the constants $C_1$ and $C_2$ appearing therein.
Proof. The expected total cost incurred in one cycle consists of three parts (cf. Equation (2)), which are related to the expected cost associated with preventive maintenance at SOs, with preventive maintenance at USOs and with corrective maintenance, respectively. It is now sufficient to derive $p_i(t)$ for $t \in [0, \tau)$. For $t \in [\tilde{t}, \tau)$, the time-dependent behavior of $p_1(t)$ is governed by
$$p_1(t) = \mu_2\, dt\; p_2(t + dt) + \bigl(1 - (\mu_1 + \lambda p)\, dt\bigr)\, p_1(t + dt).$$
Subtracting $p_1(t + dt)$ from both sides and rewriting the above equation yields
$$p_1(t) - p_1(t + dt) = \mu_2\, p_2(t + dt)\, dt - (\mu_1 + \lambda p)\, p_1(t + dt)\, dt.$$
Dividing this expression by $dt$ and letting $dt \to 0$ results in
$$-\frac{d}{dt}\, p_1(t) = \mu_2\, p_2(t) - (\mu_1 + \lambda p)\, p_1(t). \tag{5}$$
Equation (5) is easily obtained by considering a small time interval of length $dt$, and noticing that at time $t$ we are in state 1 either due to a transition from state 2 with infinitesimal probability $\mu_2\, dt$ or because we have remained in state 1 with infinitesimal probability $1 - (\mu_1 + \lambda p)\, dt$. Following a similar analysis for $p_2(t)$ yields the following system of differential equations, for $t \in [\tilde{t}, \tau)$,
$$-\frac{d}{dt}\, p_1(t) = \mu_2\, p_2(t) - (\mu_1 + \lambda p)\, p_1(t), \qquad -\frac{d}{dt}\, p_2(t) = -\mu_2\, p_2(t) + (\mu_1 + \lambda p)\, p_1(t). \tag{6}$$
Similarly, for $t \in [0, \tilde{t})$, we have
$$-\frac{d}{dt}\, p_1(t) = \mu_2\, p_2(t) - \mu_1\, p_1(t), \qquad -\frac{d}{dt}\, p_2(t) = -\mu_2\, p_2(t) + \mu_1\, p_1(t). \tag{7}$$
Solving the systems of differential Equations (6) and (7) leads to the desired solutions (3) and (4), respectively.
In this process, we would need to compute four unknown constants. This is achieved by using: (i) the normalizing condition, i.e. $p_1(t) + p_2(t) = 1$ for all $t \in [0, \tau)$, (ii) the continuity condition at $\tilde{t}$, i.e. lim
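To make the structure of this computation concrete, the following Python sketch evaluates the long-run rate of cost of a threshold policy. It is an illustration under stated assumptions, not the expression of Theorem 3: it assumes (a) that in state 1 the process returns to state 2 at rate $\mu_1 + \lambda p$ while USOs are used and at rate $\mu_1$ otherwise, as described in the proof, (b) the boundary condition $p_1(\tau^-) = (1 - p)\, p_1(0)$, inferred from the policy of always attempting PM at an SO, and (c) a cycle cost made of PM attempts at the SO, PM attempts at USOs and failures. The function name `long_run_cost` and the argument `t_thr` (standing for the threshold) are hypothetical.

```python
import math


def long_run_cost(mu1, mu2, lam, p, tau, t_thr, c_so, c_uso, c_cm):
    """Cost rate of a threshold policy (sketch; assumptions in the lead-in)."""
    a_lo = mu1 + mu2              # t in [0, t_thr): USOs are not used
    a_hi = mu1 + mu2 + lam * p    # t in [t_thr, tau): USOs are used in state 1
    e_lo = math.exp(a_lo * t_thr)
    e_hi = math.exp(a_hi * (tau - t_thr))
    s_lo, s_hi = mu2 / a_lo, mu2 / a_hi  # constant particular solutions

    # With p2 = 1 - p1, each piece solves p1'(t) = a * p1(t) - mu2, hence
    # p1(t) = s + (p1(t0) - s) * exp(a * (t - t0)).
    # Assumed boundary condition: p1(tau-) = (1 - p) * p1(0); solve for x = p1(0).
    x = (s_hi * (1 - e_hi) + e_hi * s_lo * (1 - e_lo)) / (1 - p - e_hi * e_lo)
    y = s_lo + (x - s_lo) * e_lo  # p1 at the threshold (continuity)

    # Expected time spent in state 1 on each piece of the cycle.
    int_lo = s_lo * t_thr + (x - s_lo) * (e_lo - 1) / a_lo
    int_hi = s_hi * (tau - t_thr) + (y - s_hi) * (e_hi - 1) / a_hi

    cycle_cost = (c_so * x                            # PM attempt at the SO
                  + c_uso * lam * int_hi              # PM attempts at USOs
                  + c_cm * mu1 * (int_lo + int_hi))   # corrective maintenance
    return cycle_cost / tau


# Grid search over the threshold for one illustrative parameter set.
args = dict(mu1=0.31, mu2=0.31, lam=1.0, p=0.6, tau=0.5,
            c_so=1000, c_uso=2000, c_cm=300000)
grid = [k * args["tau"] / 50 for k in range(51)]
best_t = min(grid, key=lambda t: long_run_cost(t_thr=t, **args))
print(best_t, long_run_cost(t_thr=best_t, **args))
```

Because $p_1$ is affine in its initial value on each piece, the boundary condition reduces to a single linear equation, which is why no numerical ODE solver is needed in this sketch.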

Special cases
In case of only scheduled opportunities, which corresponds to the case $\tilde{t} \to \tau$ or, equivalently, to the case $\lambda \to 0$, the probabilities $p_i(t)$ for $i \in \{1, 2\}$ are derived from the system of linear equations in (7) plus the normalizing condition, i.e. $p_1(t) + p_2(t) = 1$ for all $t \in [0, \tau)$. Plugging the resulting probabilities into Equation (2), after appropriately considering in Equation (2) only the costs related to preventive maintenance at SOs and corrective maintenance, leads to the long-run rate of cost per time unit in the case of only SOs.
In case of perfect maintenance, i.e. in case $p = 1$, the boundary condition at the SOs imposed by the policy and the imperfect maintenance in the proof of Theorem 3 reduces to $\lim_{t \to \tau^-} p_1(t) = 0$, as immediately after an SO the system is restored to state 2 with probability 1. This enables us to explicitly solve the system of linear Equations (6) and (7). Combining the resulting solution with Equation (2) results in the long-run rate of cost per time unit in the case of perfect maintenance.
In case of only unscheduled opportunities, which is equivalent to considering $\tau \to \infty$, the condition of the system can be fully described using a double descriptor $S = \{(i, j) : i \in \{1, 2\},\ j \in \{SC, USO\}\}$, which is independent of time, and thus the new model formulation falls into the framework of regular Markov decision processes. It can be easily shown that: For state 2, the optimal policy is to do nothing, and, for state 1, the optimal policy is to repair if $(\mu_1 + \mu_2)\, c^{uso}_{pm} / p < \mu_1 c_{cm}$ and to do nothing otherwise. Furthermore, the average long-run rate of cost under the optimal policy can be computed explicitly.
In case of only corrective replacements, the long-run rate of cost is equal to $c_{cm}\, \dfrac{\mu_1 \mu_2}{\mu_1 + \mu_2}$.
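The corrective-only expression follows from a renewal-reward argument: each cycle consists of an exponential sojourn in state 2 (mean $1/\mu_2$) followed by an exponential sojourn in state 1 (mean $1/\mu_1$) and ends with one corrective replacement. The short Python check below, with a hypothetical helper name, confirms that the two forms agree for the gearbox values used in the numerical section.

```python
def corrective_only_cost_rate(mu1, mu2, c_cm):
    # One renewal cycle: time in state 2 (mean 1/mu2) plus time in state 1
    # (mean 1/mu1); the cycle ends with a failure costing c_cm.
    mean_cycle_length = 1.0 / mu2 + 1.0 / mu1
    return c_cm / mean_cycle_length


mu1, mu2, c_cm = 0.31, 0.31, 300000
renewal_reward = corrective_only_cost_rate(mu1, mu2, c_cm)
closed_form = c_cm * mu1 * mu2 / (mu1 + mu2)
print(round(renewal_reward), round(closed_form))  # both round to 46500
```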

Deferring planned maintenance
In this section, we consider that upon a successful maintenance activity (preventively, at an SO or at a USO, or correctively), the upcoming planned maintenance is deferred for a period of length τ , i.e. at the instances of successful maintenance the remaining time till the next SO is set equal to τ . We are interested in computing the long-run rate of cost under deferred maintenance and, in Section 6.3, using the results of this section and of the previous sections, in investigating the economical benefits of deferring planned maintenance.
Analogously to the analysis of Section 4.2, we derive the long-run rate of cost using renewal theory, see, e.g., (Ross, 2014, Proposition 7.3, page 433). In this case, we consider the renewal points to be the instances at which there was a successful maintenance activity, i.e. the SOs or USOs at which the preventive maintenance was perfect, or the epoch at which corrective maintenance is performed. Note that the underlying stochastic process that governs the condition of the system, regenerates after each successful maintenance activity. That is, after each successful maintenance activity the underlying stochastic process is in state 2 with probability 1. The long-run rate of cost per time unit for a policy in the class of optimal policies is given in the next theorem. As the expressions appearing in the theorem do not simplify upon further computations, we chose to present them in the form of probabilities and expectations associated with the exponential distribution, as these expressions are straightforward (though cumbersome to compute) and shed insight on each of the individual events participating in the final expression, cf. Equation (8).
Theorem 4. Consider a given policy under which in state 2 we do nothing, and in state 1 we repair at scheduled opportunities and at unscheduled opportunities for which the remaining time until the next scheduled opportunity is greater than $\tilde{t} \in (0, \tau)$, and we do nothing otherwise. Furthermore, consider that planned maintenance is deferred after a successful maintenance. Under this setting, the long-run rate of cost per time unit is given by Equation (8), whose terms are expressed through a truncated exponential random variable $Y$ and its density, through quantities defined for $0 \le y \le \tau$, through the indicator function $\mathbb{1}_{\{x\}}$, which takes value 1 if event $x$ occurs and is zero otherwise, and through $T_{\mu_1} \sim \mathrm{Exp}(\mu_1)$.
Proof. See Appendix D.
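As a cross-check on the renewal argument, the deferral setting can also be simulated directly. The Python sketch below is a Monte Carlo illustration, not the expression of Theorem 4: it assumes that every cycle starts in state 2 with the next SO exactly $\tau$ away, that PM is attempted at every SO in state 1 and at USOs whose residual time to the next SO exceeds the threshold, and that each attempt is charged whether or not it succeeds. The function name, the argument `t_thr`, the cycle count and the seed are hypothetical, and the sketch requires $\lambda > 0$.

```python
import math
import random


def simulate_deferral_cost(mu1, mu2, lam, p, tau, t_thr,
                           c_so, c_uso, c_cm, n_cycles=20000, seed=0):
    """Renewal-reward estimate of the cost rate under deferral (lam > 0)."""
    rng = random.Random(seed)
    total_cost = total_time = 0.0
    for _ in range(n_cycles):
        now = rng.expovariate(mu2)              # defect arrives: state 2 -> 1
        failure = now + rng.expovariate(mu1)    # failure time if no PM succeeds
        k = math.floor(now / tau) + 1           # next SO occurs at k * tau
        while True:
            next_so = k * tau
            # Poisson arrivals are memoryless, so resampling is harmless.
            next_uso = now + rng.expovariate(lam)
            if failure <= min(next_so, next_uso):
                total_cost += c_cm              # corrective maintenance
                now = failure
                break
            if next_so <= next_uso:
                total_cost += c_so              # PM attempt at the SO
                now, k = next_so, k + 1
                if rng.random() < p:
                    break                       # success: cycle regenerates
            else:
                now = next_uso
                if next_so - now > t_thr:
                    total_cost += c_uso         # PM attempt at the USO
                    if rng.random() < p:
                        break
        total_time += now                       # cycle length
    return total_cost / total_time


estimate = simulate_deferral_cost(0.31, 0.31, 1.0, 0.6, 0.5, 0.1,
                                  1000, 2000, 300000)
print(estimate)
```

Restarting the SO clock at the beginning of each simulated cycle is what encodes the deferral: after any successful maintenance the remaining time until the next SO equals $\tau$.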

Numerical results
Using the results and the analyses of the previous sections, in this section we illustrate, through a few well-chosen examples, the effect of the various parameters on the long-run rate of cost.
In these examples, we investigate the financial advantage of the optimal policy, when compared to other (suboptimal) policies.
Furthermore, we highlight the financial benefit of perfect maintenance by comparing the long-run rate of cost for the perfect maintenance model (p = 1) to that of the imperfect maintenance model (p ∈ (0, 1)). Here, we also show the influence of imperfect maintenance on the maintenance planning. In addition, we illustrate the change introduced by the action of deferring planned maintenance after the occurrence of a successful maintenance. To illustrate the financial effects in a realistic context and to connect our analysis with the practice, we use values and data stemming from the wind industry.
Comparison of the optimal policy to suboptimal policies
In this section, we compute, in the context of the wind industry example, the long-run rate of cost under the optimal policy and we examine how it is affected by varying one by one the parameters $\tau$, $\lambda$ and $c^{uso}_{pm}$, while keeping all other parameters fixed. For the determination of the values used in the numerical computations of this section, we consider the gearbox of a wind turbine. Statistics from a recent field study by Ribrant and Bertling (2007) on Swedish wind parks in the period 1997-2005 showed that the gearbox is the most critical unit of a wind turbine. The notion of criticality is determined by the fact that a failure of the gearbox leads to the highest downtime when compared to all other wind turbine components, but also by the fact that this component has the highest failure rate among all wind turbine components (Ribrant and Bertling, 2007; Tavner et al., 2007; Spinato et al., 2009). Due to its extended downtime after a failure (which is captured in the corresponding maintenance cost), the corrective cost of a gearbox is relatively high compared to preventive maintenance costs, see, e.g., Nilsson and Bertling (2007). Based on the values reported in the aforementioned studies, we set $c_{cm} = 300000$, $c^{so}_{pm} = 1000$, $\mu_2 = 0.31$, $\mu_1 = 0.31$ and $p = 0.6$. In this case, the long-run rate of cost (in euros per year) in case of only corrective replacements is equal to 46500. Furthermore, motivated by the wind industry practice, we choose three different values for $\tau$, that is, $\tau \in \{0.25, 0.5, 1\}$ (years). Next, we consider three different values for $c^{uso}_{pm}$, i.e. $c^{uso}_{pm} \in \{2000, 3000, 4000\}$. Finally, with regard to $\lambda$, we consider four different values, i.e. $\lambda \in \{0.5, 1, 2, 4\}$.
In Table 2, we depict the long-run rate of cost for the above-mentioned values under four different policies. The first policy corresponds to replacements only at USOs (π_uso). The second policy corresponds to replacements only at SOs (π_so). The third policy is the optimal policy (π_opt), which is derived in Theorem 1. Note that it is numerically easier to obtain the optimal t by minimizing the long-run rate of cost in Theorem 3 than by using the closed-form expression in Theorem 1, as the latter requires solving for a root. The fourth policy is the optimal policy computed under the assumption p = 1. This assumption is motivated by practice, as it is often difficult to determine the value of p exactly, and it is typically assumed that maintenance restores the component to a perfect state. We refer to this policy as the p = 1 counterpart of π_opt.
In Table 2, we observe, across all instances, that incorporating planned maintenance can significantly reduce costs compared to only corrective maintenance, which can be reduced even further by adding opportunistic maintenance. Intuitively, due to the cost structure, only planned maintenance at SOs can considerably improve the long-term rate of cost when compared to performing only opportunistic maintenance at USOs.
Finally, if we compare π_opt with its p = 1 counterpart, we do not observe significant differences, despite the low value of p.
From an operational management perspective, this implies that, if decision makers have no knowledge of the value of p and face a cost structure similar to that of the gearbox case, assuming perfect maintenance will result in a long-run rate of cost that is close to optimal regardless of the true value of p. This holds as long as the preventive maintenance cost (at both opportunities) is very small in comparison to the corrective maintenance cost, as is the case for the gearbox. As a rule of thumb, one can compute the expected number of maintenance actions (planned or opportunistic) required for a successful preventive maintenance, which equals 1/p when each attempt succeeds independently with probability p, and based on this compute the long-run rate of preventive maintenance cost (approximately of the order of max{c_pm^so, c_pm^uso}/p) and compare it with the corrective cost. If the corrective cost is significantly higher, then one may assume that there is no significant difference between π_opt and its p = 1 counterpart, and as a consequence there is no significant difference in the values of the optimal policies under imperfect and perfect maintenance. In the next section, we investigate the savings that can be obtained by improving the performance of a repair when a decision maker has some knowledge regarding the value of p.
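The rule of thumb above can be illustrated with the gearbox values. The sketch below is a rough screening aid under the assumption that each maintenance attempt succeeds independently with probability p, so that the number of attempts until success has mean 1/p. The function name and the idea of reporting a ratio are illustrative choices, not part of the analysis in the paper.

```python
def pm_cost_until_success(c_so_pm, c_uso_pm, p):
    """Order-of-magnitude preventive cost until one successful maintenance.

    Assumption: attempts succeed independently with probability p, so the
    expected number of attempts is 1/p; the larger of the two costs is used.
    """
    return max(c_so_pm, c_uso_pm) / p


c_cm = 300_000   # corrective cost (gearbox example)
c_so_pm = 1000   # preventive cost at a scheduled opportunity
p = 0.6          # probability that a preventive maintenance succeeds

ratios = {}
for c_uso_pm in (2000, 3000, 4000):
    pm_cost = pm_cost_until_success(c_so_pm, c_uso_pm, p)
    ratios[c_uso_pm] = c_cm / pm_cost
    print(c_uso_pm, round(pm_cost, 2), round(ratios[c_uso_pm], 1))
```

For all three values of the opportunistic cost, the corrective cost exceeds the inflated preventive cost by well over an order of magnitude, which is the regime in which the text expects the perfect-maintenance assumption to be nearly harmless.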

Influence of imperfect maintenance
Let π_opt^(p) represent the optimal policy as a function of the successful preventive maintenance probability p, and let C(π_opt^(p)) denote the long-run rate of cost under the policy π_opt^(p). To demonstrate the effect of p on the rate of cost, we compute the relative difference in cost that results from not having perfect preventive maintenance, as a function of p. This relative difference is denoted by δ(p); it compares C(π_opt^(p)) with the cost C(π_opt^(1)) under perfect maintenance. Thus, δ(p) indicates how much extra cost is incurred due to imperfect maintenance, and shows the benefit of improving the probability of executing a perfect maintenance.
In this numerical example, as before, we choose µ_2 = 0.31 and µ_1 = 0.31. Furthermore, we set λ = 4 and τ = 1. Figure 3 shows δ(p) for p ∈ [0.5, 1] under two different cost structures (denoted by δ(p)_1 and δ(p)_2, respectively). Figure 4 depicts the corresponding optimal values of t for both cost structures, denoted by t_1 and t_2, respectively. For δ(p)_1, we use the same cost structure as in the previous section, i.e. c_pm^so = 1000, c_pm^uso = 2000 and c_cm = 300000, whereas, for δ(p)_2, we consider c_pm^so = 26500, c_pm^uso = 28800 and c_cm = 75500. The choice of the preventive maintenance costs at SOs and USOs in the second cost structure is common in the lithography industry (see Zhu et al. (2017)). Based on Figure 3, we conclude that, under both cost structures, significant costs can be saved by improving the probability of executing a perfect preventive maintenance (e.g., by training). In case of perfect repairs, the optimal values of t are t_1 ≈ 0.08 and t_2 ≈ 0.39 under the first and second cost structure, respectively.
In Figure 4, where we plot t_1 and t_2 as a function of p, we observe the following regarding the influence of p on the maintenance planning. If the preventive maintenance cost (at both opportunities) is very small compared to the cost of corrective maintenance, the order of the total preventive maintenance cost incurred until a successful preventive maintenance compared to the corrective maintenance cost is still maintained. Therefore, the maintenance planning changes little with p: the optimal policy is to almost always perform preventive maintenance at USOs for all values of p ∈ [0.5, 1]. This also explains the small discrepancy between π_opt and its p = 1 counterpart in Table 2. This is different under the second cost structure, where the maintenance planning changes substantially as a function of p. Whereas in the perfect case the optimal policy is to perform preventive maintenance at a USO if the residual time until the next SO is larger than 0.39, for p ≤ 0.83 it is optimal to never perform preventive maintenance at a USO. Here, the order of the total preventive maintenance cost incurred until a successful preventive maintenance compared to the corrective maintenance cost is not maintained.
Also under the opposite cost structure, i.e. c_pm^uso < c_pm^so (similar examples can be found for c_pm^uso = c_pm^so), the maintenance planning can be influenced significantly by the imperfect repair probability. For instance, consider the setting with µ_1 = 1.1, µ_2 = 0.9, c_pm^so = 4500, c_pm^uso = 4000, c_cm = 10000, and λ = 0.5. In case of perfect repairs (i.e. p = 1), the optimal policy is to perform preventive maintenance in state 1 at both SOs and USOs, and to do nothing otherwise (cf. Theorem 2). However, if 0.72 ≤ p ≤ 0.83, the optimal policy is to perform preventive maintenance only at USOs, and if p < 0.72, the optimal policy is to never perform preventive maintenance. This example illustrates the influence of the imperfect repair probability on the maintenance planning.

Deferring of planned maintenance
In this section, we illustrate the change introduced by deferring planned maintenance after the occurrence of a successful maintenance in three numerical examples: one relating to the wind industry, one relating to the lithography industry, and one artificially created. Figure 5 shows the long-run rate of cost for both the deferral and the no-deferral case for the example with data stemming from the wind industry. Again, with regard to the cost parameters, we use c_pm^so = 1000, c_pm^uso = 2000 and c_cm = 300000. With regard to the other parameters, we set λ = 4, τ = 1, µ_1 = 0.31, µ_2 = 0.31 and p = 0.6. We observe that deferring the planned maintenance both significantly increases the long-run rate of cost under the optimal policy (an increase of 28.14% from 8468.87 to 10852.15) and changes the value of t associated with the optimal policy from 0.112 to 0. […]
[…] respectively, based on the values of the lithography industry example. We use the same cost parameters as in Section 6.2, that is c_pm^so = 26500, c_pm^uso = 28800 and c_cm = 75500. The other parameters remain unchanged, i.e. λ = 4, τ = 1, µ_1 = 0.31, µ_2 = 0.31 and p = 0.6. Again, similarly to the numerical example for the wind industry, we observe that deferring the planned maintenance affects both the long-run rate of cost under the optimal policy (an increase of 6533.3% from 12840.12 to 851727.53) and the value of t associated with the optimal policy (from 1 to 0.175). The drastic increase is due to the cost structure: the preventive maintenance costs (at both scheduled and unscheduled opportunities) are much closer to the corrective maintenance cost than in the wind industry example.
To illustrate that the opposite effect (albeit to a much lesser degree than in the previous two examples) can also hold, we create an artificial example where we set c_pm^so = 5000, c_pm^uso = 10000 and c_cm = 19000, and λ = 4, τ = 4, µ_1 = 1, µ_2 = 0.4 and p = 0.5. Figure 7 depicts the long-run rate of cost for both the deferral and the no-deferral case for this example. Here we observe that, for all values of t, cost savings can be obtained by deferring planned maintenance after the occurrence of a successful opportunistic maintenance.
More specifically, whereas the optimal value of t is equal to 1 in both cases, the long-run rate of cost under the optimal policy decreases by 0.88%, from 6458.97 to 6402.44, when deferring planned maintenance.
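The percentage changes quoted for the three deferral examples follow directly from the reported cost rates. The short sketch below recomputes them; the dictionary keys and the function name are illustrative labels, and the cost values are those stated in this section.

```python
def relative_change_percent(cost_no_deferral, cost_deferral):
    """Percentage change in the long-run cost rate caused by deferral."""
    return 100.0 * (cost_deferral - cost_no_deferral) / cost_no_deferral


# (cost without deferral, cost with deferral), as reported in the text
examples = {
    "wind": (8468.87, 10852.15),
    "lithography": (12840.12, 851727.53),
    "artificial": (6458.97, 6402.44),
}

changes = {
    name: relative_change_percent(no_def, with_def)
    for name, (no_def, with_def) in examples.items()
}
for name, change in changes.items():
    print(f"{name}: {change:+.2f}%")
```

A positive value means deferral is more expensive (wind and lithography examples), while the negative value for the artificial example corresponds to the small saving reported there.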

Conclusion
In this paper, we considered the maintenance policy for a 3-state component degrading over time with corrective replacements at failures and preventive replacements at both scheduled and unscheduled opportunities under imperfect repair. By formulating this problem as a semi-Markov decision process, we were able to characterize the structure of the optimal maintenance policy as a control limit policy, where the control limit depends on the time until the next planned maintenance opportunity. Using this approach, a closed-form expression for the optimal control limit was derived. Within this class of control limit policies, we derived, using the theory of regenerative processes, an explicit expression for the long-run rate of cost. Using a similar approach based on renewal theory, we derived an expression for the long-run rate of cost in the case when planned maintenance is deferred after the occurrence of a successful opportunistic maintenance.
A cost comparison with other suboptimal policies has been examined, which illustrated the benefits of optimizing the maintenance policy. Specifically, it was found that incorporating planned maintenance can significantly reduce costs compared to only corrective maintenance, which can be reduced even further by adding opportunistic maintenance. Moreover, numerical results indicate that the extent of the impact of the perfect repair probability on the optimal policy depends on the underlying cost structure. It was also shown that substantial cost savings can be obtained by improving the perfect repair probability. Finally, our numerical examples indicate that the deferral of planned maintenance after the occurrence of a successful opportunistic maintenance may impact the total cost in both a negative and positive way.
There are a number of extensions and topics for future research. The most important direction is to consider the network dependency on the level of the structural degradation and failure dependencies, i.e., to consider a multi-dimensional process that captures the degradation of the various assets in the network.
Such a future direction would be particularly interesting in the case of a small number of assets for which the Poisson approximation for the opportunistic maintenance may not be accurate. In addition, another very interesting research direction would be to consider a more general model in which the condition of the system degrades through N > 2 states. Next, in this analysis, we have assumed that the condition of the system is fully observable. However, in many real applications, condition monitoring data such as spectrometric oil data or vibration data give only partial information about the underlying state of the system. From this perspective, it would be interesting to extend the model at hand to a partially observable model in which the condition monitoring data are stochastically related to the true system state. Finally, the results in this paper are valid for systems with hypo-exponentially distributed lifetimes. Future research could relax this assumption by considering a phase-type lifetime distribution.

A Optimality equations for semi-Markov decision process
We consider the so-called ratio-average cost for a controlled semi-Markov decision process, which corresponds to the limit superior of the expected total costs over a finite number of jumps divided by the expected cumulative time of these jumps; see, for instance, Ross (1970), Feinberg (1984) and Schäl (1992).
We shall use here the definition of a controlled semi-Markov decision process from Lippman (1975), Yushkevich (1982) and Jaśkiewicz (2004). A controlled semi-Markov decision process is specified by five objects: a Borel state space S, a Borel action space, a law of motion (a measurable projection determining the state as a function of an action), a transition function (transition law) p (a probability measure depending measurably on the state and the action), and a reward (or cost) function c.
The process is observed at time t = 0 to be in some state x_0 = x ∈ S. At that time an action a_0 = a ∈ A_x is chosen, where A_x is a compact set of actions available in state x. The set of all actions is A and is also assumed to be a Borel space. If the current state is x and action a is selected, then the immediate cost c_1(x; a) is incurred, and the system remains in state x_0 = x for a random time T, with a cumulative distribution depending only on x and a. The cost c_2(x; a) per time unit is incurred until the next transition occurs. Afterward, the system jumps to the state x_1 = y according to the probability measure (transition law) p(· | x, a). This procedure yields a trajectory (x_0, a_0, t_1, x_1, a_1, t_2, ...) of some stochastic process, where x_n is the state, a_n is the control variable and t_n is the time of the n-th transition, n = 0, 1, .... In the sequel, we shall refer to the corresponding random variables by means of their capital letters.
where, for brevity of notation, we denote P_x^a = P[· | x, a], depending on x ∈ S and on a ∈ A_x, and E_x^a[·] denotes the conditional expectation with respect to the measure P_x^a. Let
It is assumed that
(i) For each x ∈ S, the set A_x is a compact metric space;
(ii) For each x ∈ S, c(x; ·) is lower semicontinuous on A_x;
(iii) For each x ∈ S and every Borel set D ⊂ S, the function p(D | x, ·) is continuous on A_x;
(iv) For each x ∈ S, τ(x; ·) is continuous on A_x, and there exist positive constants b and B such that b ≤ τ(x; a) ≤ B for all x ∈ S and a ∈ A_x;
(v) There exist a constant L > 0 and a Borel measurable function V : S → [1, ∞) such that |c(x, a)| ≤ L V(x) for all x ∈ S and a ∈ A_x;
(vi) For each x ∈ S, the function ∫_S V(y) q(dy | x, ·) is continuous on A_x;
(vii) There exists a Borel set C ⊂ S such that, for some λ ∈ (0, 1) and η > 0, we have ∫_S V(y) q(dy | x, a) ≤ λ V(x) + η 𝟙_C(x) for all x ∈ S and a ∈ A_x;
(viii) The function V is bounded on C;
(ix) There exist some δ ∈ (0, 1) and a probability measure µ concentrated on the Borel set C with the property that q(D | x, a) ≥ δ µ(D) for each Borel set D ⊂ C, x ∈ C and a ∈ A_x.
for all x ∈ S. Moreover, and with π = {π_n}, n = 0, 1, ..., a (non-randomized) stationary policy.
Remark 1 (Boundedness of value function). Note that for the value function at hand, the following always holds:
The upper bound provided here corresponds to the long-run rate of cost in the case of only corrective maintenance, see Section 4.2.1.

A.1 Proof of Theorem 1
Proof. The Bellman optimality equations under the ratio-average cost criterion for the model at hand can be derived using the theory briefly sketched in Appendix A and the references therein. More concretely, let V(i, j, t) be the value function when the state of the system is (i, j, t) ∈ S. In the notation of Section 4.1, the Bellman optimality equations (cf. equation of Proposition 2) assume the following form
More explicitly, for t ∈ [0, τ), the Bellman optimality equations read as follows:
We define the following auxiliary functions, for t ∈ [0, τ),
so that Equations (23)-(28) reduce to
In this paragraph, we explain in detail how Equation (23) is obtained. State (2, SC, t) is not associated with any decision. Therefore, there is no minimum operator appearing on the right-hand side of Equation (23) and the corresponding cost is equal to zero. For the other terms appearing on the right-hand side of Equation (23), it suffices to note that there are three possible evolutions in terms of the state of the system: either an SO or an SC or a USO, where the time till the next SO is equal to t, while the times till the SC and USO are exponentially distributed with rates µ_1 and λ, respectively. In particular, the expected sojourn time of the semi-Markov decision process in state (2, SC, t) can be calculated as the expectation of the minimum of a deterministic time t and two exponentially distributed times, which can be easily verified to be equal to ∫_0^t e^{−(µ_1+λ)x} dx. The set of optimality equations for the remaining states can be obtained using very similar arguments.
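As a minimal sketch of the sojourn-time computation described above, the following Python snippet compares the integral of the survival function with its elementary closed form (1 − e^{−(µ_1+λ)t})/(µ_1+λ). The function names and the parameter values (t = 0.5, µ_1 = 0.31, λ = 4) are illustrative assumptions and are not taken from the proof.

```python
import math


def expected_sojourn_closed_form(t, mu_1, lam):
    """E[min(t, T_mu, T_lam)] for independent exponentials with rates mu_1, lam."""
    rate = mu_1 + lam
    return (1.0 - math.exp(-rate * t)) / rate


def expected_sojourn_numeric(t, mu_1, lam, steps=100_000):
    """Midpoint-rule approximation of the integral of exp(-(mu_1+lam)x) on [0, t]."""
    rate = mu_1 + lam
    h = t / steps
    return h * sum(math.exp(-rate * (k + 0.5) * h) for k in range(steps))


closed = expected_sojourn_closed_form(t=0.5, mu_1=0.31, lam=4.0)
numeric = expected_sojourn_numeric(t=0.5, mu_1=0.31, lam=4.0)
print(closed, numeric)
```

The identity used here is that the expectation of a nonnegative random variable equals the integral of its survival function, and the minimum of the three times exceeds x exactly when x < t and neither exponential clock has rung.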
Note that in Equations (31) and (32), inside the minimum, the left term corresponds to the action 'perform preventive maintenance', while the right terms correspond to the action 'do nothing'.
We observe that, since c_pm^so, c_pm^uso > 0 and p + q = 1, Equations (31) and (32) imply that it is never optimal to perform preventive maintenance in state 2, either at USOs or at SOs, respectively.
For the remainder of the proof, we distinguish four cases, each corresponding to a different set of actions.
Case (i): In state (2, SO, 0), it is optimal not to perform preventive maintenance. Furthermore, from the assumption and Equation (32) for i = 1, it becomes evident that it is also optimal not to perform preventive maintenance in state (1, SO, 0). Since the function F_1(t) − F_2(t) is, by definition, continuous in t ∈ [0, τ], since c_pm^so < c_pm^uso, and taking into account Equation (33), it is evident that there exists an ε > 0 such that
Equation (34), in light of Equation (31), implies that if the elapsed time from the SO is less than ε, then, under the assumption that it is optimal not to perform preventive maintenance on the system in state (i, SO, 0), it is also not optimal to perform preventive maintenance at a USO. In this case, for t ∈ (τ − ε, τ], we have that V(1, USO, t) = F_1(t) and V(2, USO, t) = F_2(t), cf. Equation (31). Taking the derivative with respect to t in Equation (29) and substituting the above obtained values for V(1, USO, t) and V(2, USO, t) yields
The solution to the above differential equation reads
If F_1(τ) − F_2(τ) − µ_1 c_cm/(µ_1 + µ_2) ≠ 0, it follows that, for t ∈ (τ − ε, τ], the function F_1(t) − F_2(t) is strictly monotone. In this case, by extending the previous analysis to the entire domain, which would maintain the strict monotonicity of the function F_1(t) − F_2(t), we would reach a contradiction: For t = 0,
Here, an equality sign annotated with (·) denotes that the equality follows from Equation (·). We thus have
Due to (36), it is evident that F_1(τ) − F_2(τ) − µ_1 c_cm/(µ_1 + µ_2) = 0; thus the function F_1(t) − F_2(t) satisfying Equation (35) is a constant function, i.e.
Combining Equation (33) with Equation (37) leads to the optimality condition for Case (i). That is, if
we do not perform preventive maintenance at any opportunity.
Case (ii): In state (2, SO, 0), similarly to the previous case, it is optimal not to perform preventive maintenance. However, from the assumption and Equation (32) for i = 1, it becomes evident that it is optimal to perform preventive maintenance on the system in state (1, SO, 0). Similarly to Case (i), as F_1(τ) − F_2(τ) < c_pm^uso/p, there exists an ε > 0 for which (34) holds.
Repeating the same analysis as in Case (i), we can show that, for t ∈ [0, τ], the function F_1(t) − F_2(t) satisfies Equation (35) and that it is a non-decreasing function if
However, for t = 0, we now have that
Combining (40) with (35) (on the domain t ∈ [0, τ]) yields
Combining Equations (38), (39), and (41) leads to the optimality condition for Case (ii). That is, if
we perform preventive maintenance on the system if it is in state 1 at an SO but not at a USO.

D Proof of Theorem 4
Proof. We first focus on the derivation of the cycle length appearing in the denominator of Equation (8).
Observe that the length of a renewal cycle consists of the time the system spends in state 2 plus the time from the state change 2 → 1 until the first successful maintenance. To this end, let CL denote the length of the part of the renewal cycle that the underlying stochastic process spends in state 1. Furthermore, let Y denote the random amount of time from a state change 2 → 1 to the first SO. We then have for the probability density function of Y that f_Y(y) = f_{T_{µ_2}}(τ − y | T_{µ_2} < τ), which leads to Equation (13). Conditioning on Y, a renewal cycle can end before the first SO, at the first SO, or after the first SO. Hence, the expected cycle length is equal to
We first focus on deriving expressions for the individual expectations in Equation (52). Note that the first successful maintenance can be of type j ∈ {SC, SO, USO} and may occur in the interval [t, t′]; this is denoted in short by j[t, t′]. Thus, rewriting the first part in Equation (52) results in (cf. Equation (9))
For the second expectation in Equation (52), observe that the length of this part can be further decomposed: first the system goes through a geometric number of intervals of length τ in which no successful maintenance activity takes place, after which the system enters the last interval, in which the successful maintenance activity takes place. To this end, let p_u be the probability that there is no successful maintenance activity in an arbitrary interval between two SOs (including the SO with which this interval ends) after the state change 2 → 1, i.e.
We then have, from the memoryless property of T_{µ_1} and T_{λp},
where E[CL 𝟙{CL ≤ Y} | Y = τ] is the expected length of the last part of the renewal cycle, i.e. the interval in which the successful maintenance activity takes place. Analogously to Equation (53), we can further decompose E[CL 𝟙{CL ≤ Y} | Y = τ] by conditioning on the type of the successful maintenance activity with which it ends.
We are now left with defining the events that lead to j[t, t′], such that we can calculate the expectations in Equations (17) (14). Equations (15) and (16) are obtained along similar lines. Note that all expectations and probabilities only involve exponentially distributed random variables. Consequently, closed-form expressions can be obtained using straightforward calculus. However, for the sake of brevity, we have chosen to provide one closed-form expression and omit the rest (which can be obtained analogously). For Equation (17), the closed-form expression reads
(1 − e^{−(λp+µ_1)(y−t)} (1 + (λp + µ_1)(y − t))) / (λp + µ_1).
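The closed-form expression quoted for Equation (17) has the shape of a truncated first moment of an exponential random variable. As a hedged check, the sketch below assumes the interpretation E[T·𝟙{T ≤ s}] with T exponentially distributed with rate a = λp + µ_1 and s = y − t; this reading, the function names and the numerical values are assumptions made for illustration, since the left-hand side of the equation is not reproduced in the text.

```python
import math


def truncated_mean_closed_form(a, s):
    """(1 - exp(-a*s) * (1 + a*s)) / a, the expression quoted in the text."""
    return (1.0 - math.exp(-a * s) * (1.0 + a * s)) / a


def truncated_mean_numeric(a, s, steps=100_000):
    """Midpoint-rule value of the integral of x * a * exp(-a*x) over [0, s]."""
    h = s / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += x * a * math.exp(-a * x)
    return h * total


a = 4.0 * 0.6 + 0.31   # lambda * p + mu_1 with illustrative values
s = 0.5                # plays the role of y - t
closed_tm = truncated_mean_closed_form(a, s)
numeric_tm = truncated_mean_numeric(a, s)
print(closed_tm, numeric_tm)
```

Integration by parts gives the same result analytically, and as s grows the value tends to 1/a, the unconditional mean of the exponential time.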
We now focus on the numerator of Equation (8), i.e. the expected cycle cost. To that end, let CC be the cost incurred in a renewal cycle. The analysis for the expected cycle cost, E[CC], is similar to the analysis of the expected cycle length. Again, we decompose the length of a renewal cycle into three parts (i.e. the interval after the state change until the first SO, the geometric number of intervals of length τ in which no successful maintenance activity takes place, and the last interval in which the successful maintenance activity takes place), and compute the conditional expected cycle costs in these parts (mainly consisting of costs incurred at unsuccessful maintenance activities). Thus,
We first focus on the first part in Equation (55) and condition further on the type of activity, which yields
Analogously to the expected cycle length, the expected cost incurred during the geometric number of intervals of length τ, in which no successful maintenance activity takes place, is equal to
∑_{k=0}^{∞} p_u^k (1 − p_u) k (λ(1 − p)(τ − t) c_pm^uso + c_pm^so) = (λ(1 − p)(τ − t) c_pm^uso + c_pm^so) p_u / (1 − p_u).
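The geometric-sum identity used for the unsuccessful intervals can be checked numerically. In the sketch below, the value of p_u is a hypothetical placeholder (its defining expression is not reproduced here), `t_limit` is an illustrative name for the control limit t, and the remaining values are taken from the wind example for concreteness.

```python
def cost_failed_intervals_closed_form(p_u, lam, p, tau, t_limit, c_so_pm, c_uso_pm):
    """Expected cost over the geometric number of unsuccessful intervals."""
    cost_per_interval = lam * (1.0 - p) * (tau - t_limit) * c_uso_pm + c_so_pm
    return cost_per_interval * p_u / (1.0 - p_u)


def cost_failed_intervals_series(p_u, lam, p, tau, t_limit, c_so_pm, c_uso_pm,
                                 terms=2000):
    """Truncated version of the series sum_k p_u^k (1 - p_u) k * cost_per_interval."""
    cost_per_interval = lam * (1.0 - p) * (tau - t_limit) * c_uso_pm + c_so_pm
    return sum(p_u**k * (1.0 - p_u) * k * cost_per_interval for k in range(terms))


params = dict(p_u=0.3,  # hypothetical probability of an unsuccessful interval
              lam=4.0, p=0.6, tau=1.0, t_limit=0.1,
              c_so_pm=1000.0, c_uso_pm=2000.0)
closed_cost = cost_failed_intervals_closed_form(**params)
series_cost = cost_failed_intervals_series(**params)
print(closed_cost, series_cost)
```

The factor p_u/(1 − p_u) is the mean of a geometric random variable counting failures before the first success, so the closed form is simply the per-interval cost multiplied by the expected number of unsuccessful intervals.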
Observe that the expected cost in the interval in which the successful maintenance activity takes place is composed of two parts regardless of the type of activity, i.e. the cost of the successful maintenance activity itself and the cost related to the unsuccessful USOs up to the successful maintenance activity (see Equations (20)-(22)). Again, all expectations and probabilities related to the costs only involve exponentially distributed random variables, and again, for the sake of brevity, we have chosen to provide one closed-form expression and omit the rest (which can be obtained analogously). For Equation (21), we have
E[CC 𝟙{SO[τ − y, τ]}] = (c_pm^so + λ(1 − p) c_pm^uso max{y − t, 0}) P(SO[τ − y, τ]),
with
P(SO[τ − y, τ]) = P(T_{µ_1} > y) 𝟙{y < t} + P(T_{λp} > y − t, T_{µ_1} > y) 𝟙{y ≥ t} = e^{−µ_1 y} 𝟙{y < t} + e^{−(µ_1 y + λp(y − t))} 𝟙{y ≥ t},
which completes the proof.
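The two expressions given for Equation (21) translate directly into code. The sketch below is an illustrative transcription under the assumption that the cost term in Equation (21) multiplies the whole probability; `t_limit` is a hypothetical name for the control limit t, and the numerical values are chosen only for demonstration.

```python
import math


def prob_so_event(y, t_limit, mu_1, lam, p):
    """P(SO[tau - y, tau]): no failure in y, and no successful USO while acted upon."""
    if y < t_limit:
        # USOs are never used when the SO is closer than the control limit
        return math.exp(-mu_1 * y)
    return math.exp(-(mu_1 * y + lam * p * (y - t_limit)))


def expected_cost_so_event(y, t_limit, mu_1, lam, p, c_so_pm, c_uso_pm):
    """Cost of the SO plus the expected cost of unsuccessful USOs, times the probability."""
    failed_uso_cost = lam * (1.0 - p) * c_uso_pm * max(y - t_limit, 0.0)
    return (c_so_pm + failed_uso_cost) * prob_so_event(y, t_limit, mu_1, lam, p)


args = dict(t_limit=0.1, mu_1=0.31, lam=4.0, p=0.6)
prob_below = prob_so_event(0.05, **args)
prob_above = prob_so_event(0.5, **args)
cost_below = expected_cost_so_event(0.05, c_so_pm=1000.0, c_uso_pm=2000.0, **args)
print(prob_below, prob_above, cost_below)
```

Both branches of the probability agree at y = t_limit, so the expression is continuous in y, and below the control limit the cost reduces to the scheduled maintenance cost weighted by the survival probability.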