1 Introduction

At present, the maintenance mode of the railway traction power supply system (TPSS) in China combines scheduled maintenance, fault repair and daily inspection, with scheduled maintenance as the dominant method. This mode can guarantee the reliability of the TPSS to some extent, but it brings problems of high costs, frequent maintenance and low efficiency [1,2,3]. Since equipment maintenance relies on manual operation, power to the entire network or area must be cut off while equipment is under maintenance. Frequent maintenance therefore increases the power outage time of the system, which may not only affect the operation plans of existing trains, but also undermine the “all-weather operation” of the railway and reduce the transportation capacity [4]. Thus, it is of great importance to investigate how to improve the maintenance efficiency of the TPSS and reduce the system power outage time.

The current maintenance mode of TPSS in China is mainly guided by the maintenance specifications formulated by China State Railway Group Co., Ltd. According to these specifications, detailed maintenance schemes for a specific TPSS are developed considering practical situations. The maintenance schemes are formulated on an annual, monthly, and daily basis. Firstly, the annual maintenance schemes are formulated according to the maintenance specifications of TPSS. Then those schemes are divided and specific maintenance tasks are arranged for each month and each week. Finally, a detailed maintenance work ticket for specific equipment is filed, which specifies detailed procedures to implement the maintenance tasks. Since the equipment operation condition may change during operation, a routine inspection is performed in addition to scheduled maintenance, and the observed operation condition is used to adjust the maintenance scheme accordingly. Meanwhile, emergency repair is also included: when the equipment endangers the safe operation of the TPSS, immediate maintenance tasks are performed [5]. In conclusion, the scheduled maintenance mode, which is devised according to the maintenance specifications of the railway administration, is at present the main mode adopted in TPSS.

Regarding the problems of high costs, frequent maintenance, and low efficiency of the scheduled maintenance mode, many scholars have explored the maintenance modes of various systems. Huang et al. [6] proposed a real-time maintenance method for the machine tools on a production line, and established a data-driven model of a serial production line for analysing production dynamics, which effectively reduced the maintenance costs of machine tools. Chen et al. [7] established a state-based adaptive maintenance model for degraded systems with different operating conditions, and the optimal maintenance cost allocation and maintenance threshold were determined by the maximum availability function. Ref. [8] optimized the maintenance of a multi-state manufacturing system. In view of the limitation of maintenance resources, a maintenance optimization model considering uncertain maintenance effect was proposed, which solves for the optimal maintenance strategy under the constraints of maintenance time and costs. Afzali et al. [9] studied the maintenance strategy of substation equipment, established an indicator model of equipment life efficiency, and then proposed a method for optimal maintenance of substation equipment. Under the condition of guaranteeing the reliability of equipment, the Cuckoo optimization algorithm was adopted to find the maintenance strategy with the minimum maintenance costs. The above studies mainly explore the maintenance mode of the system with the goal of reducing the maintenance costs. However, the system downtime during maintenance is not included, which may result in over-long power outage time and affect the normal operation of the system.

In the current scheduled maintenance mode applied in TPSS, each piece of equipment is maintained independently and coordination among different equipment is not considered. Since equipment reliability decreases with accumulated service time, and the degradation follows a certain pattern, the maintenance of multiple pieces of equipment can be coordinated so that they are repaired simultaneously to improve efficiency. To achieve this, Berg proposed the concept of the opportunistic maintenance strategy [10]. This strategy is defined as follows: when one piece of equipment in the system fails, if the service life of another piece of equipment exceeds a preset limit, both are repaired or replaced at the same time [11,12,13]. Zhang et al. [14] and Atashgar et al. [15] made use of the system downtime during maintenance and advanced the maintenance time of some equipment. This kind of simultaneous maintenance of multiple equipment effectively reduced maintenance costs. Wang et al. [16] studied the preventive maintenance strategy for the component layer of electric multiple units. By combining the maintenance operation plans of related components within the opportunistic maintenance mileage, the strategy effectively reduced the number of system maintenance downtimes. Xie et al. [17] considered the special working environment of wind turbines and adopted the opportunistic maintenance strategy. By optimizing the preventive maintenance age and the opportunistic maintenance age, the system maintenance costs were significantly reduced. In the above studies, the opportunistic maintenance strategy is applied to optimize the maintenance costs of the system, but the reliability requirements of the equipment are not considered during the optimization process. To solve this problem, Zhao et al. [18] considered the reliability of each component and introduced the concept of opportunistic maintenance based on the preventive maintenance mode. By coordinating the maintenance of multiple components, the total costs were significantly reduced. Besnard et al. [19] proposed a strategy based on opportunistic maintenance for wind turbine components, and utilized the golden section method to optimize the reliability of opportunistic maintenance. However, the same reliability measurement was adopted for all the components, so it failed to reflect the different reliability requirements of different components. In view of this, Zhao et al. [20] proposed a preventive opportunistic maintenance (POM) strategy for wind turbines considering the different reliability requirements of each component. The reliability maintenance margin was introduced to optimize the opportunistic maintenance interval of components to obtain the minimum maintenance costs. However, this margin was considered identical for all the components, and thus the differences among the reliability degradation curves of the components cannot be appropriately described.

In order to deal with the aforementioned problems, this paper proposes a POM method for TPSS based on equipment reliability. Firstly, the Weibull distribution is used to characterize the reliability degradation process of the equipment. Then a POM model for the TPSS is established, and the optimal maintenance scheme is achieved by minimizing the total power outage time during the planned operation cycle of the TPSS. The contributions of this paper are as follows:

  1. The proposed method makes full use of the historical fault data of equipment in TPSS, and introduces the Weibull distribution to characterize the reliability function of the equipment. By combining this with the current maintenance cycle of the equipment, the equipment reliability that matches the actual requirement is estimated.

  2. It considers the differences in the degradation processes of different types of equipment. By setting a corresponding reliability maintenance margin, the opportunistic maintenance interval of each piece of equipment is calculated, and simultaneous maintenance among multiple pieces of equipment can be achieved as far as possible within the planned operation cycle.

  3. Through the proposed method, POM of power supply equipment in TPSS can be realized. System downtime is reduced through simultaneous maintenance of multiple pieces of equipment. Compared with the traditional fixed-cycle preventive maintenance mode, the proposed method can effectively reduce power outage time and improve maintenance efficiency.

The paper is organized as follows: In Sect. 2, the POM theory is explained; the Weibull distribution is introduced to characterize the reliability degradation process of equipment; the reliability maintenance margin is defined to describe the opportunistic maintenance interval of the equipment. Section 3 establishes the POM model for TPSS by minimizing the total power outage time during the planned operation cycle. Section 4 presents a case study: based on an actual TPSS, the minimum power outage time under different reliability maintenance margins is solved. Then the optimal maintenance scheme is determined according to the stepwise distribution of power outage time. Finally, Sect. 5 presents the conclusions of this paper.

2 POM theory based on equipment reliability

A conceptual framework is presented to clearly demonstrate the work of this paper, as shown in Fig. 1. Based on the historical fault data, the Weibull distribution is used to describe the equipment reliability. With the POM method, the optimal maintenance strategy for the TPSS is formulated by minimizing the total power outage time in the planned operation cycle under the constraint of equipment reliability requirement. Such strategies are then used to guide on-site maintenance work.

Fig. 1 Framework of the POM for TPSS

The reliability of power supply equipment in TPSS decreases with service time, and the reliability function follows a certain probability distribution. TPSS consists of different types of equipment, and each type shows a distinctive failure period and failure process; therefore it is difficult to describe the reliability degradation processes of the equipment using a simple exponential or normal distribution [21, 22]. The Weibull distribution, in contrast, reflects well the impact of equipment defects and daily deterioration on reliability, since it is derived from the weakest-link model or series model [23]. This paper introduces the two-parameter Weibull distribution to describe the degradation trend of the equipment [24, 25]. The reliability function is presented as follows:

$$ R(t) = \exp \left[ { - \left( {\frac{t}{\eta }} \right)^{m} } \right],\quad t > 0 \, , $$
(1)

where η represents the scale parameter and m the shape parameter. The two parameters are estimated from the historical fault data of the equipment using the least squares method. Figure 2 shows the reliability degradation curve described by the Weibull distribution [20].
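The following is a minimal sketch of Eq. (1) and of one common way to carry out a least squares fit, not necessarily the procedure used in this paper. It linearises the Weibull form as ln(−ln R) = m ln t − m ln η and approximates the empirical reliability at each failure time with Bernard's median rank; the median-rank choice and the function names are assumptions made for illustration.

```python
import math

def weibull_reliability(t, eta, m):
    """Two-parameter Weibull reliability R(t) = exp(-(t/eta)^m), t > 0 (Eq. 1)."""
    return math.exp(-((t / eta) ** m))

def fit_weibull_least_squares(failure_times):
    """Estimate (eta, m) with an ordinary least squares line on the linearised form.

    ln(-ln R) = m*ln(t) - m*ln(eta). The empirical failure probability at the
    i-th sorted failure time uses Bernard's median rank (an assumption of this
    sketch, not stated in the paper).
    """
    times = sorted(failure_times)
    n = len(times)
    xs = [math.log(t) for t in times]
    ys = []
    for i in range(1, n + 1):
        f = (i - 0.3) / (n + 0.4)          # median-rank estimate of F(t_i)
        ys.append(math.log(-math.log(1.0 - f)))
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    m = sxy / sxx                           # slope gives the shape parameter
    intercept = y_mean - m * x_mean         # intercept equals -m*ln(eta)
    eta = math.exp(-intercept / m)
    return eta, m
```

A typical call would pass the sorted times to failure of one equipment type, for example `fit_weibull_least_squares(times_in_years)`, and feed the returned pair into `weibull_reliability`.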

Fig. 2 Reliability degradation curve

The difference of reliability value between two time points is defined as the reliability maintenance margin, which describes the degradation trend of the equipment [20]. As shown in Fig. 2, the difference between reliability R1 and R2, denoted by ΔR, is the reliability maintenance margin in this case. Meanwhile, ΔT represents the length of interval [T1, T2], which refers to the operation time related to reliability R1 and R2. According to the slope of the reliability degradation curve at T2, denoted by k, the relationship between ΔR and ΔT can be expressed as follows:

$$ \begin{aligned} \left| k \right| = \frac{\Delta R}{\Delta T} & = \left| R^{\prime}(T_{2}) \right| \\ & = \frac{m}{\eta}\left( \frac{T_{2}}{\eta} \right)^{m - 1} \exp \left[ - \left( \frac{T_{2}}{\eta} \right)^{m} \right] \\ & = \frac{m}{\eta}\left( \frac{T_{2}}{\eta} \right)^{m - 1} R(T_{2}). \end{aligned} $$
(2)

Given ΔR, the opportunistic maintenance interval ΔT can be determined through Eq. (2); in addition, Eq. (2) shows that the length of the opportunistic maintenance interval ΔT can be adjusted by changing the reliability maintenance margin ΔR. Because the reliability functions of different equipment differ, the value of ΔT may differ for the same ΔR. For a fixed ΔR, ΔT decreases when the magnitude of the slope of the reliability curve increases, and ΔT increases when that magnitude decreases.
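As a small illustration of Eq. (2), the sketch below converts a reliability maintenance margin ΔR into an interval length ΔT using the slope magnitude at the preventive maintenance time. The parameter values are hypothetical and chosen only to show that ΔT scales linearly with ΔR under this tangent approximation.

```python
import math

def weibull_reliability(t, eta, m):
    """R(t) = exp(-(t/eta)^m), as in Eq. (1)."""
    return math.exp(-((t / eta) ** m))

def opportunistic_interval(delta_r, t_p, eta, m):
    """Return Delta T = Delta R / |R'(T_p)|, following Eq. (2)."""
    slope = (m / eta) * (t_p / eta) ** (m - 1) * weibull_reliability(t_p, eta, m)
    return delta_r / slope

# Hypothetical parameters for illustration only (not from the paper's tables).
eta, m, t_p = 2.0, 3.0, 1.0
for delta_r in (0.02, 0.10, 0.20):
    dt = opportunistic_interval(delta_r, t_p, eta, m)
    print(f"dR={delta_r:.2f} -> dT={dt:.3f}, window=[{t_p - dt:.3f}, {t_p:.3f}]")
```

A steeper curve at T_p (larger slope magnitude) yields a shorter window for the same margin, which is the behaviour described in the paragraph above.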

According to the relationship between ΔR and ΔT, POM is presented: a minimum reliability Rp is determined for each piece of equipment, and the corresponding operation time is regarded as the preventive maintenance time Tp. The time prior to Tp by ΔT is defined as the POM time, denoted as To; the interval ΔT at this time is the opportunistic maintenance interval, and the POM can be performed on the equipment within this interval. In this manner the maintenance time of the equipment is extended from a single time point to a time period. This allows the maintenance of equipment to cooperate with each other; thus the simultaneous maintenance can be realized.

In actual maintenance, the equipment performance may not be fully restored after maintenance [20]. In order to characterize the impact of the latest maintenance on the current equipment operation, a repair factor λ (0 < λ ≤ 1) is introduced to describe the state of the equipment. When λ = 1, the equipment is “fully recovered” by the latest repair, and the equipment operation time is deemed to return to zero. When 0 < λ < 1, the equipment performance is “restored to a certain degree”, and the operation time is considered to fall back by a certain period. The POM method is illustrated by analysing typical power supply equipment operating in a TPSS [26]; a schematic diagram is shown in Fig. 3.

Fig. 3 Schematic diagram of preventive opportunistic maintenance (POM)

For equipment a, b, and c, the preventive maintenance intervals are denoted by Ta, Tb and Tc, and the opportunistic maintenance intervals are [To(a), Tp(a)], [To(b), Tp(b)] and [To(c), Tp(c)], respectively. During the operation of TPSS, the POM strategy works as follows:

  (a) When the operation time of equipment a reaches the preventive maintenance time Tp(a1), the first round of maintenance is performed for the system: equipment a undergoes preventive maintenance; POM is considered for equipment b since the opportunistic maintenance interval of equipment b overlaps with that of equipment a (shown as the shaded part in Fig. 3); equipment c obtains no opportunity for maintenance since its opportunistic maintenance interval does not overlap with that of equipment a. Equipment a and b are fully recovered after this round of maintenance, and the operation times of both are reset to zero.

  (b) When the operation time reaches Tp(c1), the second round of maintenance is performed. The opportunistic maintenance intervals of the equipment do not overlap, and therefore maintenance is only performed for equipment c.

  (c) In the third round of maintenance, equipment a and b are repaired simultaneously because their opportunistic maintenance intervals overlap.

  (d) Similarly, in the fourth round of maintenance, equipment a and b undergo POM during the preventive maintenance of equipment c.

By implementing POM, the number and duration of maintenance downtime can be effectively reduced. Each time the equipment is under maintenance following its preventive maintenance plans, a certain period of maintenance downtime is required. However, by applying POM, when there is equipment under preventive maintenance, other equipment can be repaired simultaneously during the maintenance downtime. In this manner, during this single power outage period, multiple pieces of equipment are repaired, which significantly reduces the total power outage time and improves maintenance efficiency.
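The decision taken in a single maintenance round can be sketched as follows. The sketch assumes that all windows are expressed on a common time axis and that a piece of equipment joins an outage when the outage time falls inside its own interval [To, Tp]; the `Window` type, the `plan_round` function and the numeric windows are hypothetical and only loosely mirror the first round described for Fig. 3.

```python
from dataclasses import dataclass

@dataclass
class Window:
    name: str
    t_o: float  # start of the opportunistic maintenance interval
    t_p: float  # preventive maintenance time (end of the interval)

def plan_round(windows):
    """Select the equipment whose T_p comes first and those that can join it."""
    trigger = min(windows, key=lambda w: w.t_p)
    outage_time = trigger.t_p
    # Equipment joins if the outage falls inside its own [T_o, T_p] window.
    joined = [w.name for w in windows
              if w is not trigger and w.t_o <= outage_time <= w.t_p]
    return outage_time, trigger.name, joined

# Hypothetical windows: b overlaps the outage caused by a, c does not.
windows = [Window("a", 0.8, 1.0), Window("b", 0.9, 1.2), Window("c", 1.3, 1.6)]
print(plan_round(windows))
```

Here equipment a receives preventive maintenance, b is repaired opportunistically in the same outage, and c waits for a later round.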

3 POM model for TPSS

3.1 The composition of power outage time

Due to the particularity of the TPSS, we desire to reduce the power outage time as much as possible while ensuring the reliability of the equipment during maintenance activities. In the formulation of maintenance strategies for TPSS, power outage time has become an important indicator. The power outage times of the TPSS for preventive maintenance and POM mode are different. The power outage time includes the following two parts:

  1. Maintenance operation time: The time that maintenance personnel spend on repairing the equipment, denoted as Tmaint.

  2. Power supply scheduling time: The time prior to power restoration after the maintenance is completed, denoted as Tps.

For the power supply equipment in TPSS, the power outage time of preventive maintenance Tpmot includes maintenance operation time Tmaint and power supply scheduling time Tps, while the power outage time of POM Tomot only includes maintenance operation time Tmaint.

3.2 POM model

For a TPSS that consists of n pieces of power supply equipment, when the reliability of equipment k degrades to the minimum reliability Rp(k), preventive maintenance is required to maintain the reliability of equipment k. The corresponding preventive maintenance time is denoted as Tp(k). As for the preventive maintenance mode, when equipment reliability degrades to Rp(k), preventive maintenance is performed. If equipment k is fully recovered after maintenance, it will be repaired when its reliability degrades to Rp(k) again. The preventive maintenance model is as follows:

$$ \left\{ \begin{array}{ll} M_{\text{p}(k)} = T/T_{k} & (k = 1, 2, \ldots, n), \\ T_{\text{ptot}} = \sum\limits_{k = 1}^{n} \left[ T_{\text{pmot}(k)} M_{\text{p}(k)} \right], & \end{array} \right. $$
(3)

where Mp(k) is the number of preventive maintenances of equipment k; T is the planned operation cycle of the TPSS; Tk is the preventive maintenance cycle of equipment k; Tptot is the total power outage time for preventive maintenance of the TPSS; Tpmot(k) is the preventive maintenance power outage time of equipment k.
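Eq. (3) can be evaluated with a few lines of code. In the sketch below, Tpmot(k) is taken as Tmaint + Tps as defined in Sect. 3.1, and Mp(k) = T/Tk is rounded down to a whole number of maintenances; the rounding convention is an assumption of this sketch, since the text does not state one, and the durations in the usage example are placeholders.

```python
import math

def preventive_total_outage(plan_cycle, equipment):
    """Eq. (3): T_ptot = sum_k T_pmot(k) * M_p(k), with M_p(k) = T / T_k.

    `equipment` maps a name to (T_k, T_maint, T_ps). M_p(k) is rounded down
    to an integer count, which is an assumption of this sketch.
    """
    total = 0.0
    counts = {}
    for name, (cycle, t_maint, t_ps) in equipment.items():
        m_p = math.floor(plan_cycle / cycle)
        t_pmot = t_maint + t_ps        # preventive outage = repair + scheduling
        counts[name] = m_p
        total += t_pmot * m_p
    return total, counts

# Cycles in years; outage durations in hours are hypothetical placeholders.
equipment = {
    "transformer": (1.15, 2.0, 1.0),
    "circuit breaker": (0.85, 2.0, 1.0),
    "disconnector": (1.00, 2.0, 1.0),
}
total, counts = preventive_total_outage(20.0, equipment)
print(total, counts)
```

Because every piece of equipment is counted separately, each preventive maintenance contributes its own outage, which is the inefficiency that the POM model targets.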

In the preventive maintenance model, each piece of equipment is maintained independently and the coordination of multiple pieces of equipment is not considered. As a result, the total power outage time can be long when several pieces of equipment need maintenance. In the POM method, by contrast, when preventive maintenance is performed for equipment k because its reliability has degraded to Rp(k), other equipment whose preventive maintenance time is close to that of equipment k can be repaired simultaneously. The POM model is as follows:

$$ \min \left\{ T_{\text{otot}} = \sum\limits_{i = 1}^{z} \left[ \max \{ T_{\text{pmot}(k)},\; T_{\text{omot}(h)} \} \; (k,h = 1, 2, \ldots, n) \right] \right\}, $$
(4)

s.t.

$$ \left\{ \begin{array}{ll} T_{\text{p}(k)} - T_{\text{o}(k)} = \dfrac{\Delta R_{(k)}}{\dfrac{m_{(k)}}{\eta_{(k)}}\left( \dfrac{T_{\text{p}(k)}}{\eta_{(k)}} \right)^{m_{(k)} - 1} R_{\text{p}(k)}} & (k = 1,2, \ldots, n), \\ \lambda R\left( T_{\text{p}(k)} \right) = R_{\text{p}(k)} & (k = 1,2, \ldots, n), \\ \lambda R\left( T_{\text{o}(k)} \right) = R_{\text{o}(k)} & (k = 1,2, \ldots, n), \\ z = \sum\nolimits_{k = 1}^{n} M_{\text{p}(k)}. & \end{array} \right. $$
(5)

The objective function of the model is shown in Eq. (4), where Totot is the total power outage time of POM; Tomot(h) is the POM power outage time of equipment h; z is the number of preventive maintenances within the planned operation cycle of the TPSS. In each preventive maintenance, the power outage time of preventive maintenance (Tpmot(k)) is compared with that of POM (Tomot(h)), and the longer of the two is taken as the outage time of this maintenance. Finally, the total power outage time of POM is obtained by adding up the power outage times of all preventive maintenances.
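A minimal sketch of how the sum in Eq. (4) could be evaluated for a given list of maintenance rounds is shown below. Each round is represented as a pair (equipment under preventive maintenance, list of equipment under POM); this data layout, the function name and the durations are assumptions made for illustration.

```python
def pom_total_outage(rounds, t_maint, t_ps):
    """Eq. (4): sum over rounds of max(T_pmot(k), T_omot(h)).

    `rounds` is a list of (preventive_equipment, [opportunistic_equipment, ...]).
    T_pmot = T_maint + T_ps for the preventive item, T_omot = T_maint for
    every item repaired opportunistically in the same outage.
    """
    total = 0.0
    for preventive, opportunistic in rounds:
        t_pmot = t_maint[preventive] + t_ps[preventive]
        durations = [t_pmot] + [t_maint[h] for h in opportunistic]
        total += max(durations)   # the longest task sets the outage length
    return total

# Hypothetical durations in hours.
t_maint = {"transformer": 3.0, "breaker": 1.5, "disconnector": 1.0}
t_ps = {"transformer": 0.5, "breaker": 0.5, "disconnector": 0.5}
rounds = [("breaker", ["transformer", "disconnector"]), ("breaker", [])]
print(pom_total_outage(rounds, t_maint, t_ps))
```

In the first round the opportunistic repair of the transformer is the longest task and therefore determines the outage, while the second round lasts only as long as the breaker's preventive maintenance.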

The constraints of the model are shown in Eq. (5). The slope of the reliability function is combined with the opportunistic maintenance interval, and the reliability maintenance margin is used to characterize the opportunistic maintenance interval. Different reliability maintenance margins are specified for different equipment, and through adjusting such margins, the opportunistic maintenance interval of each piece of equipment can be overlapped to the greatest extent. Thus, simultaneous maintenance among equipment can be maximized to achieve the optimal maintenance strategy of TPSS. At the same time, the repair factor λ is used to characterize the impact of this maintenance on the next operation of equipment. The reliability function and reliability requirement of the equipment are combined to ensure the rationality of the opportunistic maintenance interval.

3.3 POM decision process

The POM decision for TPSS is an iterative optimization process. Firstly, given the reliability maintenance margin ΔR(k) of each piece of equipment and following the POM method, the preventive maintenance times Mp(k) and POM times Mo(k) of each piece of equipment are counted until the operation time of the TPSS reaches the planned operation cycle. Then, the total power outage time of TPSS is obtained through the objective function of the POM model. The total power outage times of the TPSS corresponding to different reliability maintenance margins ΔR(k) are compared to obtain the minimum power outage time. Finally, the maintenance scheme corresponding to the minimum power outage time is obtained, including the equipment reliability maintenance margin ΔR(k), the maintenance time, and the specific equipment that needs to be repaired at each maintenance time. The flow chart is shown in Fig. 4.

Fig. 4 The flow chart of POM strategy for TPSS

The POM decision process of TPSS is mainly divided into four steps as follows.

Step 1: Input data. It mainly includes the planned operation cycle T of the TPSS, historical fault data, reliability requirements and reliability maintenance margins of n pieces of equipment. In addition, the parameters related to power outage time are also obtained.

Step 2: Acquisition of equipment maintenance interval. The Weibull distribution is used to characterize the reliability function of the equipment, and the POM method is applied to the TPSS. The maintenance interval of each piece of equipment is determined by the reliability maintenance margin.

Step 3: Establishment of decision model for POM strategy. After determining the reliability maintenance margin of each piece of equipment, preventive maintenance or POM is performed for the equipment according to the cooperation of the opportunistic maintenance interval of such equipment, and the preventive maintenance times Mp(k) and POM times Mo(k) of the equipment are counted until the operation time of the TPSS reaches the planned operation cycle. Finally, according to the power outage time parameters, the corresponding system power outage time, maintenance time and the equipment that needs to be repaired at each maintenance time are obtained.

Step 4: Formulation of the optimal maintenance scheme. Update the reliability maintenance margin of each piece of equipment until such margin of all equipment meets the requirements. Compare the system power outage time under different reliability maintenance margins, and choose the maintenance scheme corresponding to the minimum power outage time, which is the optimal maintenance scheme for the TPSS.
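The four steps above can be sketched as a small simulation wrapped in an exhaustive search over the margins. This is an illustrative reading of the procedure, not the authors' implementation: it assumes λ = 1, lets a piece of equipment join an outage when its age lies inside its own window [To, Tp], and searches a 0.02-spaced grid. The Weibull parameters and maintenance cycles are borrowed from the case study in Sect. 4, while `t_maint` and `t_ps` (hours) are placeholders because Table 4 is not reproduced here.

```python
import itertools
import math

def window_start(delta_r, t_p, eta, m):
    """T_o = T_p - Delta R / |R'(T_p)|, the first constraint of Eq. (5)."""
    slope = (m / eta) * (t_p / eta) ** (m - 1) * math.exp(-((t_p / eta) ** m))
    return max(0.0, t_p - delta_r / slope)

def simulate(equipment, margins, horizon):
    """Return (total outage, rounds) for one set of margins, assuming lambda = 1."""
    t_o = [window_start(dr, e["t_p"], e["eta"], e["m"])
           for e, dr in zip(equipment, margins)]
    age = [0.0] * len(equipment)
    clock, outage, rounds = 0.0, 0.0, []
    while True:
        # Next preventive maintenance: the equipment closest to its T_p.
        waits = [e["t_p"] - a for e, a in zip(equipment, age)]
        k = min(range(len(equipment)), key=lambda i: waits[i])
        clock += waits[k]
        if clock > horizon:
            return outage, rounds
        age = [a + waits[k] for a in age]
        # Equipment already inside its own window joins this outage.
        joined = [h for h in range(len(equipment))
                  if h != k and age[h] >= t_o[h] - 1e-12]
        durations = [equipment[k]["t_maint"] + equipment[k]["t_ps"]]
        durations += [equipment[h]["t_maint"] for h in joined]
        outage += max(durations)
        rounds.append((clock, k, joined))
        for h in [k] + joined:
            age[h] = 0.0  # full recovery (lambda = 1)

def best_margins(equipment, horizon, grid):
    """Exhaustively search the margin grid for the minimum total outage."""
    best = None
    for margins in itertools.product(grid, repeat=len(equipment)):
        outage, rounds = simulate(equipment, margins, horizon)
        if best is None or outage < best[0]:
            best = (outage, margins, rounds)
    return best

# Order: transformer, circuit breaker, disconnector. Times in years;
# t_maint and t_ps in hours are hypothetical placeholders.
equipment = [
    {"eta": 2.01, "m": 2.99, "t_p": 1.15, "t_maint": 2.0, "t_ps": 1.0},
    {"eta": 1.33, "m": 3.04, "t_p": 0.85, "t_maint": 2.0, "t_ps": 1.0},
    {"eta": 1.46, "m": 3.15, "t_p": 1.00, "t_maint": 2.0, "t_ps": 1.0},
]
grid = [round(0.02 * i, 2) for i in range(1, 11)]  # 0.02 ... 0.20
outage, margins, rounds = best_margins(equipment, horizon=20.0, grid=grid)
print(outage, margins, len(rounds))
```

Because the search returns the first grid point that attains the minimum, other margin combinations may give the same outage; the numerical results depend on the placeholder durations and should not be compared with the figures reported in the case study.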

4 Case study

4.1 Input data of model

4.1.1 Fault data of power supply equipment in TPSS

Typical power supply equipment in TPSS including transformer, circuit breaker and disconnector is introduced for the case study. The statistical data of these power supply facilities of the same batch under the same technical specifications were collected. These data were acquired from a power supply section in China Railway Administration from June 2016 to June 2019. The fault tables of equipment can be obtained after filtering and sorting the fault data, as given in Tables 1, 2 and 3.

Table 1 Transformers fault data
Table 2 Circuit breakers fault data
Table 3 Disconnectors fault data

Based on the fault data of the power supply equipment, the least squares method is used to estimate the Weibull parameters of each type of equipment. The reliability functions of the transformer, circuit breaker and disconnector are then obtained, respectively, as follows:

$$ R_{1} (t) = \exp \left[ { - \left( {\frac{t}{2.01}} \right)^{2.99} } \right], $$
(6)
$$ R_{2} (t) = \exp \left[ { - \left( {\frac{t}{1.33}} \right)^{3.04} } \right], $$
(7)
$$ R_{3} (t) = \exp \left[ { - \left( {\frac{t}{1.46}} \right)^{3.15} } \right]. $$
(8)

Reliability degradation curves of these types of equipment are shown in Fig. 5.

Fig. 5 Reliability degradation curves of equipment

The reliability degradation curves of the equipment in Fig. 5 describe the degradation law of equipment of the same batch under the same technical specifications, which is not applicable to each piece of independent equipment. If such batch of equipment is not maintained during operation, the reliability degradation law can be described by the curves above.

4.1.2 Parameter setting for each type of equipment

In the case study, by referring to the “High-speed Railway Traction Substation Operation and Maintenance Regulations” and considering the actual situation of TPSS, the planned operation cycle is taken as 20 years [27]. According to the current maintenance rules for TPSS, the specified maintenance cycles for the transformer, circuit breaker and disconnector are 1 year, but the actual maintenance cycle can be extended or shortened by up to 15% on this basis [5]. Accordingly, the preventive maintenance cycle of each piece of equipment is adjusted so as to better realize POM of the power supply equipment. As the key equipment of the TPSS, the transformer must meet strict design and manufacturing requirements. It is highly reliable and rarely fails. Therefore, in order to avoid excessive maintenance of the transformer, its maintenance cycle is set to 1.15 years. The circuit breaker operates frequently and is therefore prone to failure during operation [28]. To avoid insufficient maintenance of the circuit breaker, its maintenance cycle is set to 0.85 years. Because differences in the preventive maintenance cycles of the equipment better reflect the advantages of POM, the maintenance cycle of the disconnector is kept at 1 year. Given the maintenance cycle of each piece of equipment, combined with the equipment reliability function R(t) represented by the two-parameter Weibull distribution in Eq. (1), the reliability requirement of the equipment in actual operation can be obtained.
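The last step of the paragraph above, evaluating R(t) at the chosen maintenance cycle, can be sketched as follows. The sketch assumes that the time variable in Eqs. (6) to (8) is measured in years, which the text does not state explicitly; the resulting values are computed here for illustration and are not quoted from Table 4.

```python
import math

def weibull_reliability(t, eta, m):
    """R(t) = exp(-(t/eta)^m), as in Eq. (1)."""
    return math.exp(-((t / eta) ** m))

# (eta, m) from Eqs. (6)-(8); maintenance cycles in years from the text.
cases = {
    "transformer": (2.01, 2.99, 1.15),
    "circuit breaker": (1.33, 3.04, 0.85),
    "disconnector": (1.46, 3.15, 1.00),
}
# Reliability requirement R_p taken as the reliability at the end of one cycle.
requirements = {name: weibull_reliability(cycle, eta, m)
                for name, (eta, m, cycle) in cases.items()}
for name, r_p in requirements.items():
    print(f"{name}: R_p = {r_p:.3f}")
```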

Table 4 gives the Weibull distribution parameters of the equipment, reliability requirements, maintenance operation time, and power supply scheduling time.

Table 4 Related data for equipment

In the case study, the equipment features high reliability requirement and long planned operation cycle. Multiple iterations will be carried out during the maintenance decision; thus the value of repair factor λ has a huge impact on the outcome of the decision. In order to facilitate the simulation and calculation of the model, in this paper we consider the repair factor λ = 1.

4.2 The solution process of POM model

Given the reliability maintenance margin of the equipment, ΔR(1), ΔR(2) and ΔR(3), the opportunistic maintenance interval [To(k), Tp(k)] of each piece of equipment can be obtained. Since the value of ΔR(k) determines the opportunistic maintenance time of the equipment and meanwhile ΔR(k) is a non-zero value according to the definition of POM, the minimum value of ΔR(k) is set to 0.02. In addition, according to the results of multiple tests, to avoid over-repair of the equipment, we set the maximum value of ΔR(k) as 0.2 to limit the opportunistic maintenance interval for the equipment. According to the flow chart of POM strategy, the POM times Mo(k) and preventive maintenance times Mp(k) of each piece of equipment can be obtained. Combining with the objective function of the model, the total power outage time Totot of POM can be obtained. Finally, the distribution of total power outage time is represented by the scatter diagram, as shown in Fig. 6.

Fig. 6 The total power outage time corresponding to the maintenance margin of each piece of equipment

In Fig. 6, the coordinate (ΔR(1), ΔR(2), ΔR(3)) represents the reliability maintenance margin of transformer, circuit breaker and disconnector, respectively. The colour of each point in the space reflects the total power outage time of the TPSS. As shown in the figure, the distribution of power outage time is block-shaped. When ΔR(1), ΔR(2) and ΔR(3) change within a certain range, the corresponding power outage time remains the same. When ΔR(1) and ΔR(3) change as independent variables, respectively, the power outage time is decreasing; when ΔR(2) changes, the power outage time remains unchanged. The influence of setting ΔR(1), ΔR(2) and ΔR(3) each as independent variable on the total power outage time of TPSS is discussed below.

Figure 7 depicts that when the reliability maintenance margin of the transformer ΔR(1) is set as independent variable, the power outage time decreases from 67.14 h. When ΔR(1) > 0.13, the minimum power outage time is 48.72 h. When the reliability maintenance margin of the circuit breaker ΔR(2) is set as independent variable, the minimum power outage time in the system is 48.72 h. When the disconnector reliability maintenance margin ΔR(3) is set as independent variable, the minimum power outage time of the system decreases from 72.9 h. When ΔR(3) > 0.11, the minimum power outage time is 48.72 h.

Fig. 7 Minimum power outage time curves of each piece of equipment

The minimum power outage time curves of the transformer and the disconnector follow the same downward trend. For a small value of ΔR(k), the corresponding opportunistic maintenance interval ΔT(k) is small. This case is similar to normal preventive maintenance, so the advantages of POM cannot be fully exploited, which results in a longer power outage time. When ΔR(k) increases, the corresponding opportunistic maintenance interval ΔT(k) increases. The opportunistic maintenance intervals of different equipment are then more likely to overlap, so simultaneous maintenance becomes more likely and the power outage time is reduced. The curve of the minimum power outage time for the circuit breaker is a horizontal straight line. This is because the circuit breaker, compared with the other equipment, has the shortest preventive maintenance cycle and undergoes the most preventive maintenance. The cooperation among equipment has already been maximized, and a change of ΔR(k) does not lead to any change of power outage time. The power outage time of the TPSS is determined by the preventive maintenance times Mp(k) and the POM times Mo(k) of each piece of equipment. Mp(k) and Mo(k) are affected by the reliability maintenance margin ΔR(k), but they are integer counts and do not change continuously with ΔR(k). Therefore, the maintenance times corresponding to ΔR(k) within a certain range are fixed; that is, the power outage time remains unchanged. As a result, the distribution of the power outage time is block-shaped in Fig. 6 and the power outage time curves in Fig. 7 change stepwise.

According to the minimum power outage time curves corresponding to the reliability maintenance margin of the three types of equipment, the minimum power outage time in the three curves is 48.72 h. From this, it can be determined that the minimum power outage time of the TPSS in this case is 48.72 h. Because the power outage time distribution of the TPSS is block-shaped, the minimum power outage time does not correspond to a single value of the reliability maintenance margin. By analysing Fig. 6, the equipment reliability maintenance margin corresponding to the minimum power outage time of the TPSS can be represented in Fig. 8.

Fig. 8 Range of equipment reliability maintenance margin corresponding to the minimum power outage time

The shaded part in Fig. 8 is the range of equipment reliability maintenance margin corresponding to the minimum power outage time. It is worth noting that the reliability maintenance margins of the circuit breaker and the disconnector, denoted by ΔR(2) and ΔR(3) respectively, are not completely continuous in Fig. 8b. When 0.16 ≤ ΔR(2) ≤ 0.20, the value of ΔR(3) is limited to [0.10, 0.14] and [0.18, 0.20]. Apart from this condition, the values of the reliability maintenance margin of each piece of equipment are continuous. Every value of ΔR(k) in the shaded part of the figure corresponds to the minimum power outage time of the TPSS. The optimization results of the POM strategy are given in Table 5.

A larger value of the reliability maintenance margin ΔR(k) corresponds to a larger opportunistic maintenance interval ΔT(k), which makes simultaneous maintenance more likely. Each piece of equipment therefore obtains the maximum maintenance interval when ΔR(k) = 0.20. The corresponding maintenance reliability interval is given in Table 5 and is in line with the current maintenance reliability requirements.

Table 5 Optimization results of the POM strategy (when 0.16 ≤ ΔR(2) ≤ 0.20, ΔR(3) ∉ (0.14, 0.18))

4.3 Analysis of maintenance strategy results

The minimum power outage time of the TPSS is calculated with the POM model. Although the minimum power outage time corresponds to multiple sets of reliability maintenance margins ΔR(k), the resulting maintenance scheme is the same for all of them. The specific maintenance scheme for POM is shown in Fig. 9.

Fig. 9 Specific maintenance scheme for POM of TPSS

Figure 9 shows that during the planned operation cycle of the TPSS, each piece of equipment is maintained 24 times. Only the circuit breaker undergoes preventive maintenance, while the transformer and the disconnector undergo only POM. This is because the preventive maintenance cycles of the transformer, circuit breaker and disconnector differ only slightly. When the preventive maintenance of a piece of equipment is brought forward, simultaneous maintenance among the equipment is easy to achieve, which saves more power outage time. In the current maintenance mode of the TPSS, preventive maintenance is the main method adopted for power supply equipment, and the number of preventive maintenance actions and the total power outage time of each piece of equipment can be obtained from preventive maintenance models. In Table 6, the results of optimal decision-making for the POM method are compared with the results of preventive maintenance.
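The grouping effect described above can be sketched with a simplified simulation. It is not the reliability-based POM model of this paper: it assumes fixed preventive cycles and fixed opportunistic windows, and the function name `simulate`, the cycle lengths, the window lengths and the horizon are all hypothetical values chosen for illustration. Whenever one piece of equipment reaches its preventive maintenance time, every other piece whose opportunistic window is already open is maintained in the same outage.

```python
def simulate(cycles, windows, horizon):
    """Count outages, preventive (mp) and opportunistic (mo) maintenance actions.

    cycles[k]  : preventive maintenance cycle of equipment k
    windows[k] : length of the opportunistic window before the due time of k
    horizon    : planned operation period (same time unit as cycles)
    """
    last = {k: 0 for k in cycles}  # time of the most recent maintenance of k
    mp = {k: 0 for k in cycles}
    mo = {k: 0 for k in cycles}
    outages = 0
    while True:
        due = {k: last[k] + cycles[k] for k in cycles}
        t = min(due.values())  # next preventive maintenance time in the system
        if t > horizon:
            break
        outages += 1  # one shared outage for everything maintained at time t
        for k in cycles:
            if due[k] == t:
                mp[k] += 1  # equipment k is due: preventive maintenance
                last[k] = t
            elif t >= due[k] - windows[k]:
                mo[k] += 1  # window of k is open: maintain it opportunistically
                last[k] = t
    return outages, mp, mo


# Hypothetical cycles in months (not the case data of the paper).
CYCLES = {"transformer": 12, "circuit_breaker": 10, "disconnector": 11}
HORIZON = 120

for window in (0, 1, 2, 3):
    outages, mp, mo = simulate(CYCLES, {k: window for k in CYCLES}, HORIZON)
    print(f"window = {window}: outages = {outages}, Mp = {mp}, Mo = {mo}")
```

With these assumed values and a window of 3 months, the equipment with the shortest cycle is the only one that reaches its preventive maintenance time, and the other two are always maintained opportunistically alongside it, which mirrors the pattern reported for Fig. 9. Because the counts are integers, the number of outages changes only at discrete window lengths.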

Table 6 Comparison of preventive maintenance and preventive opportunistic maintenance (POM) results

In Table 6, the total number of preventive maintenances for the POM method is 24, which is 36 fewer than the 60 required without POM. Although the number of maintenance actions for each piece of equipment increases under POM, multiple pieces of equipment can be maintained simultaneously under certain circumstances, so the total power outage time of the TPSS decreases significantly. The power outage time is reduced from 104.21 h to 48.72 h, a saving of 55.49 h compared with preventive maintenance. The time saved mainly consists of the power supply scheduling time and the maintenance time of the transformer and disconnector during preventive maintenance of the equipment, and it amounts to 53.2% of the total power outage time under preventive maintenance. The POM strategy can thus make full use of the maintenance operation time and reduce the total power outage time of the TPSS.
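The figures quoted above can be cross-checked with a few lines of arithmetic. The input values are those stated in the text; the variable names are illustrative only.

```python
# Values quoted in the text for preventive maintenance (PM) and POM.
outage_pm_h = 104.21   # total power outage time under preventive maintenance, h
outage_pom_h = 48.72   # total power outage time under POM, h
count_pm = 60          # total number of maintenances without POM
count_pom = 24         # total number of maintenances with POM

saved_h = outage_pm_h - outage_pom_h
saved_share = saved_h / outage_pm_h
fewer_maintenances = count_pm - count_pom

print(f"Outage time saved: {saved_h:.2f} h")           # 55.49 h
print(f"Share of PM outage time: {saved_share:.1%}")   # 53.2%
print(f"Fewer maintenances: {fewer_maintenances}")     # 36
```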

5 Conclusions

In this paper, a preventive opportunistic maintenance method based on equipment reliability is proposed for the traction power supply system, with the goal of minimizing the total power outage time during the planned operation cycle of the TPSS. Using the fault data of the transformer, circuit breaker and disconnector in an actual power supply section as input, the POM model is verified and analysed. The following conclusions are obtained:

  1. The proposed method uses the Weibull distribution to characterize the reliability function of the power supply equipment and obtains the opportunistic maintenance interval by defining the reliability maintenance margin, so that POM can be implemented for the equipment while the reliability requirement is met.

  2. The POM method, by implementing simultaneous maintenance among multiple pieces of equipment, significantly reduces the number of system downtimes. It provides a good basis for the operation and maintenance unit to formulate equipment maintenance programs and achieve simultaneous maintenance.

  3. The proposed method takes into account the maintenance correlation among different power supply equipment, which not only improves the efficiency of the current scheduled maintenance mode, but also saves power outage time.