Introduction

Maintenance planning is one of the main factors in production, with the purpose of reducing machine downtime and total cost. Effective planning increases manufacturing speed and product quality. Different maintenance planning models have been studied in the literature [7, 8, 11, 18, 24, 26, 30, 31, 33, 34].

In practice, maintenance activities are not perfect. One of the main reasons for imperfect maintenance is human error during work [19]. Errors by maintenance operators are unavoidable, and they may affect both safety and system efficiency. For example, poor maintenance can play an important role in increasing the number of equipment failures. Dhillon and Liu [9] explained the important role of humans in the reliability of equipment and published a literature review on human error in maintenance. Yang and Dhillon [37] provided an approach to analyse the level of availability in a human-machine system. In their method, different distribution functions such as Gamma, Weibull and Exponential are considered for repair time. Kumar and Gandhi [20] quantified human error in maintenance using graph theory and a matrix approach. Their approach played a significant role in identifying sources of human error and predicting their impact. In this study, it is assumed that the operator’s performance improves over time and that maintenance time decreases as a consequence. This is called the learning effect.

In the past decades, the problem of maintenance and production has been investigated from several perspectives. Several models have been proposed in this research area, and recent studies have tried to integrate these models. Budai et al. [4] reviewed the studies that have integrated production and maintenance models.

In this study, maintenance is performed under two main strategies: corrective and preventive maintenance. Corrective maintenance is carried out when equipment fails, while preventive maintenance is performed at periodic time intervals.

Computer simulation is the process of designing a mathematical-logical model of a real system and experimenting with this model on a computer [27]. Optimizing maintenance strategies is difficult, because the many parameters that significantly affect the equipment and its performance lead to complex analytical models. Building a maintenance planning model by purely mathematical methods is not an easy task; therefore, simulation is used in such cases [28]. In this study, a maintenance optimization problem in an automated series manufacturing system is considered by simulating the manufacturing operation. Different scenarios are generated using different levels of inputs in the simulation approach.

AHP and TOPSIS have proven successful in maintenance strategy selection, as in many other decision-making problems. AHP helps analysts organize the critical aspects of a problem into a hierarchical structure like a family tree. By reducing complex decisions to a series of simple comparisons and rankings, then synthesizing the results, AHP not only helps analysts arrive at the best decision, but also provides a clear rationale for their choices [6]. The basic principle of TOPSIS is that the chosen alternative should have the shortest distance from the positive ideal solution and the farthest distance from the negative ideal solution [17]. Several studies have used AHP or TOPSIS to select the best maintenance policy [13, 16, 32, 35].

The literature review shows that the impact of maintenance parameters such as reliability and availability has been investigated in only a few studies [5, 1214, 23, 36]. While former studies considered only a few features, this study investigates the influence of more parameters on improving the decision-making process in maintenance problems. The parameters considered in this study are reliability, availability, operator error and learning effect. Another advantage of this study is that the inputs are considered at multiple levels and the ranking of scenarios is exact. A multi-criteria model is presented that simulates different scenarios with the maintenance parameters mentioned above. Since the criteria do not all have the same effect, assigning weights to the criteria is important from the decision makers’ point of view. Weight allocation by means of AHP is flexible, and the weights of the criteria can differ. The rank of each scenario is then determined from the weighted criteria using the TOPSIS method, and the best policy is selected accordingly. The proposed algorithm is suitable for small and medium-sized problems of different dimensions and can examine various options to find the best solution. In particular, it can handle problems with different numbers of scenarios. However, as the number of machines increases, testing the different options for each machine takes much more time.

The remainder of this study is structured as follows. This first section presented the introduction and the relevant literature on maintenance, human error, simulation and AHP-TOPSIS. The “Methodology” section presents the methodological concepts, including simulation and AHP-TOPSIS, in more detail. The case study is then introduced in the “Experiment” section. The results are reported in the “Results and Discussion” section, and the “Conclusion” section concludes the study.

Fig. 1 Integrated simulation-AHP-TOPSIS approach

Methodology

This study is aimed at modelling and scheduling maintenance activities for an operating unit. This aim can be met by exact analytical methods and mathematical models or by simulation. Because of the complex interaction of the various parameters, an exact mathematical model for this problem would be very complicated and might not even exist. Hence, simulation is a better option for modelling maintenance activities in this case. In this method, the required data are collected from a production line of household appliances. Based on the data, failure and production functions are obtained. The more accurate the data are, the more accurate the estimated functions will be. Significant parameters in maintenance planning optimization are estimated from the actual process and the random behaviour of the system. In this study, Visual Simulation Language for Analogue Modelling has been used to design the simulation network. Maintenance activities of the system are simulated, considering human errors and the learning effect. Production and maintenance functions are estimated from the historical data of the production line by means of EASYFIT software. The input parameters are the time to preventive maintenance, the number of maintenance operators and the operators’ skill. Combinations of these parameters form several scenarios. The simulation network is executed for these scenarios and, as a result, several outputs are obtained.

Since this problem is multi-criteria and the outputs have different natures with different units, an integrated AHP and TOPSIS method is used to find the optimal solution, which is the best combination of all parameters. Since decision makers usually do not regard all criteria as equally important, different weights are assigned to the criteria based on experts’ opinions.

Fig. 1 illustrates the approach. The designed simulation network was run several times with different inputs and different scenarios. Based on the nature of the problem, AHP and TOPSIS were identified as appropriate methods to rank the scenarios. In the following sections, the simulation, AHP and TOPSIS methods are described.

Simulation

Simulation is the process of modelling the real world and conducting experiments with the model. It is a powerful tool for understanding the behaviour of systems and analysing scenarios [27]. Computer simulation has been used in various engineering fields [10, 15, 21, 22]. One of the useful features of this study is the development of a simulation network for modelling maintenance activities that allows decision makers to adjust the parameters of the problem and make the best decision. The simulation is implemented in Visual Simulation Language for Analogue Modelling, a fully object-oriented simulation language. The structure of the simulation network is described later.

The Integrated AHP-TOPSIS Approach

The analytic hierarchy process (AHP) is a powerful tool for solving complex decision problems [29]. In the AHP approach, the decision problem is structured hierarchically at different levels, with each level consisting of a finite number of decision elements. The upper level of the hierarchy represents the overall goal, while the lower level consists of all possible alternatives. One or more intermediate levels embody the decision criteria and sub-criteria [25]. For pairwise comparisons, the 1–9 scale of [29] was used to determine the degree of importance of activities.

TOPSIS is a multiple-criteria method for identifying solutions from a finite set of alternatives. It defines an index that measures similarity to the positive-ideal solution and remoteness from the negative-ideal solution. One of the advantages of TOPSIS is that it is easy to compute and understand, because the method works directly with definite values supplied by experts to calculate the final results.

The steps of TOPSIS approach are as follows:

Step 1 Creation of a decision matrix to rank, including m alternatives and n criteria, where \(\hbox {A}_{1},\hbox {A}_{2},\ldots ,\hbox {A}_{m}\) are different scenarios. \(\hbox {C}_{1},\hbox {C}_{2},\ldots ,\hbox {C}_{n}\) are different criteria. In this study, there are 128 scenarios (alternatives) and 9 criteria. In all following steps of TOPSIS approach, the parameters have been replaced with their values (\(\hbox {m}=128\) and \(\hbox {n}=9\)).

Step 2 The decision matrix is normalized as follows:

$$\begin{aligned} n_{ij} =\frac{x_{ij} }{\sqrt{\sum _{i=1}^{128} x_{ij}^2}},\quad i=1,2,\ldots ,128,\,j=1,2,\ldots ,9 \end{aligned}$$
(1)

The value of each cell in a column is divided by the square root of the sum of squares of the values in that column.

Step 3 Assign weights to the normalized decision matrix. The weighted matrix V is calculated as follows:

$$\begin{aligned} v_{ij} =n_{ij} \times w_j\quad i=1,2,\ldots ,128,\,j=1,2,\ldots ,9 \end{aligned}$$
(2)

Step 4 Determine the ideal and non-ideal solution, according to the following relations:

$$\begin{aligned}&A^{+}=\left( {v_1^+,v_2^+,\ldots ,v_9^+}\right) \nonumber \\&\quad v_j^+=\max _i \left\{ {v_{ij}}\right\} ,\nonumber \\&\quad i=1,2,\ldots ,128,\,j=1,2,\ldots ,9 \end{aligned}$$
(3)
$$\begin{aligned}&A^{-}=\left( {v_1^-,v_2^-,\ldots ,v_9^-}\right) \nonumber \\&\quad v_j^-=\min _i\left\{ {v_{ij}}\right\} ,\nonumber \\&\quad i=1,2,\ldots ,128,\,j=1,2,\ldots ,9 \end{aligned}$$
(4)

Step 5 Determine the distance of each alternative from the ideal solution as follows:

$$\begin{aligned} d_i^+=\sqrt{\sum _{j=1}^9 \left( {v_j^+ -v_{ij}} \right) ^{2}}\quad i=1,2,\ldots ,128 \end{aligned}$$
(5)

And from the non-ideal solution as follows:

$$\begin{aligned} d_i^-=\sqrt{\sum _{j=1}^9 \left( {v_j^- -v_{ij}} \right) ^{2}}\quad i=1,2,\ldots ,128 \end{aligned}$$
(6)

Step 6 Calculate the relative closeness of each alternative to the ideal solution as follows:

$$\begin{aligned} cl_i^+ =\frac{d_i^-}{d_i^-+d_i^+}\quad i=1,2,\ldots ,128 \end{aligned}$$
(7)

Step 7 Rank the options. A higher value indicates a better option; the best option is the one with the greatest relative closeness to the ideal solution.
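The seven steps above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the implementation used in the study: the function name `topsis` and the small example matrix are hypothetical, and, following Eqs. (3) and (4) as written, every criterion is treated as one for which larger values are better. Cost-type criteria would need to be transformed beforehand, or have their maximum and minimum swapped.

```python
import math


def topsis(matrix, weights):
    """Return the relative closeness of each alternative (Steps 2 to 6).

    matrix  -- m rows (alternatives), each holding n criterion values
    weights -- n criterion weights
    Assumes every criterion is benefit-type and no column is all zeros.
    """
    m, n = len(matrix), len(weights)

    # Step 2: divide each cell by the root of the column's sum of squares.
    norms = [math.sqrt(sum(matrix[i][j] ** 2 for i in range(m))) for j in range(n)]
    normalized = [[matrix[i][j] / norms[j] for j in range(n)] for i in range(m)]

    # Step 3: weight the normalized matrix.
    v = [[normalized[i][j] * weights[j] for j in range(n)] for i in range(m)]

    # Step 4: ideal and non-ideal solutions, one value per criterion.
    ideal = [max(v[i][j] for i in range(m)) for j in range(n)]
    non_ideal = [min(v[i][j] for i in range(m)) for j in range(n)]

    # Step 5: Euclidean distances to both reference points.
    d_plus = [math.sqrt(sum((ideal[j] - v[i][j]) ** 2 for j in range(n))) for i in range(m)]
    d_minus = [math.sqrt(sum((non_ideal[j] - v[i][j]) ** 2 for j in range(n))) for i in range(m)]

    # Step 6: relative closeness (undefined if all alternatives are identical).
    return [d_minus[i] / (d_minus[i] + d_plus[i]) for i in range(m)]


# Hypothetical data: three alternatives scored on two criteria.
scores = topsis([[7, 9], [8, 7], [9, 6]], [0.5, 0.5])
# Step 7: indices of the alternatives, best first.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(scores, ranking)
```

In the case study the same computation would run on a 128 by 9 matrix, with the weights taken from AHP.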

Experiment

In this section, the proposed approach is applied to a case study. A system consisting of four machines in a series production line of household appliances is considered. These machines periodically require maintenance, and each of them is prone to failure according to a specific distribution function. As the time to periodic maintenance grows, the probability of failure increases, which adds to the cost of the system: the longer the system is halted, the more unforeseen expenses are incurred, and the production program might be interrupted. Also, when a machine is halted, the process stops.

Since the problem is a complex optimization problem, the following assumptions are made to simplify it:

  • The machines can suddenly fail.

  • The production system is automated, and human error plays no role in this part of the system.

  • Operators may accidentally commit errors in any position, and these errors reduce the usual time between failures.

Running a simulation requires data in the form of functions or numbers, so the production and failure functions must be defined first. Using EASYFIT software, different distribution functions were tested and fitted to the input data (Table 1).

Table 1 The best fitted distribution functions using EASYFIT

After determining the inputs, the simulation networks were constructed. The four machines are arranged in series in the network, and the output of each machine is the input of the next one (Fig. 2).

Fig. 2 Visual Simulation Language for Analogue Modelling network consisting of four machines

In the simulation network, there is an AWAIT node before each machine; if the machine is busy or under maintenance, the entity waits at this node.

If the machine is neither under maintenance nor failed, processing starts on the entity waiting at the AWAIT node. After the process is completed, the machine is released by the FREE node and is ready to serve the next entity or to undergo preventive maintenance or repair. Finally, the COLCT and TERMINATE nodes collect information such as the frequency of failures, the frequency of preventive maintenance and the time of the entity in the system, and the entity then leaves the system.

Periodic maintenance is modelled by four separate networks, one for each machine. In each network, the entity is generated by a CREATE node. Once the maintenance interval of the machine has elapsed, the entity enters an ASSIGN node, which updates the counter of periodic maintenance activities, and then enters the ALTER node, which takes up the capacity of the machine. It then enters an AWAIT node and waits for a maintenance operator. After that, periodic maintenance is performed with a duration drawn from a specific distribution, and the capacity of the machine and of the maintenance operator is changed by ALTER and FREE nodes. Expert and normal maintenance operators may commit errors with a defined probability during the operation. In this experiment, error rates of 5 % and 10 % are attributed to expert and normal maintenance operators, respectively. These error rates were obtained from past experience and records; the higher rate is assigned to normal operators and the lower rate to expert ones. If the operator commits an error, an ASSIGN node increments the operator error counter and simultaneously reduces the interval between machine failures. After the next interval, the next periodic maintenance entity re-enters the ASSIGN node and the maintenance counter is incremented again. The operator learning effect is also considered, as a time-dependent function under which the maintenance time decreases as time passes. This function is shown in Fig. 3. The network continues until the end of the simulation and is repeated for each of the machines.
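The interplay of operator error and learning described above can be illustrated with a small Monte Carlo sketch. Only the error rates of 5 % and 10 % come from the case study. The function name `simulate_pm`, the power-law learning curve indexed by repetition count, the multiplicative reduction of the failure interval and all default values are assumptions made for illustration; the study does not state the form of its learning function or the size of the reduction.

```python
import random

# Error probability per preventive maintenance activity, from the case study.
ERROR_RATE = {"expert": 0.05, "normal": 0.10}


def simulate_pm(n_activities, skill, base_time=2.0, learning_exponent=0.2,
                initial_gap=100.0, gap_factor=0.9, seed=0):
    """Simulate a sequence of preventive maintenance (PM) activities.

    ASSUMPTIONS (not from the source): the power-law learning curve, the
    multiplicative gap reduction and every default parameter value.
    Returns (error_count, failure_gap, durations).
    """
    rng = random.Random(seed)
    errors = 0
    gap = initial_gap
    durations = []
    for k in range(1, n_activities + 1):
        # Learning effect: the k-th repetition takes less time than the first.
        durations.append(base_time * k ** (-learning_exponent))
        # Human error: count it and shorten the interval between failures.
        if rng.random() < ERROR_RATE[skill]:
            errors += 1
            gap *= gap_factor
    return errors, gap, durations


errors, gap, durations = simulate_pm(50, "normal")
print(errors, round(gap, 2), [round(d, 2) for d in durations[:3]])
```

Fixing the seed makes a run reproducible, which helps when comparing scenarios that differ only in operator skill.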

Random failures are modelled by a separate network for each machine. The network starts with a CREATE node, which inserts an entity into the network. The entity enters a GOON node, where a condition is examined. If the time elapsed since the last check is longer than the expected time to failure, the entity proceeds into the network and a failure occurs. Otherwise, the entity waits for another period and the condition is checked again.

If the condition is satisfied and a failure occurs, the entity immediately enters a PREEMPT node and the machine is taken into service and repair. The entity then enters the AWAIT node and waits for an operator to repair the machine. When a maintenance operator is allocated to the machine, the repair is performed with a duration drawn from the specified distribution. Then, the entity enters the FREE node, and the capacity of the machine and of the operator is released. Finally, the entity enters the ASSIGN node, which counts the number of unplanned downtimes.

The periodic maintenance operators and the failure service operators are considered separately in the network, and the number of each varies between one and two. The utilization and availability of the operators are taken into account in the final decision on maintenance personnel requirements (Fig. 4).

Fig. 3 Simulation network of preventive maintenance activities considering human error

Fig. 4 Random failure simulation network

The network was run 128 times for different combinations of periodic maintenance settings, and the results were analysed. The values are obtained from the mean and the lower limit of machine downtime based on the records. Different maintenance durations for each machine, different numbers of maintenance operators (1 or 2) and different operator skills lead to the scenarios shown in Table 2. Note that there are no redundant rules: all rules in this study are active, all operators are active and there is no redundant allocation.
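One way to enumerate such a full-factorial set of scenarios is sketched below. The factor structure is an assumption: two maintenance levels for each of the four machines, one or two operators for preventive and for corrective maintenance, and two skill levels, which happens to give 2^7 = 128 combinations. The actual levels are those of Table 2, and the numbering produced here need not match the scenario numbers used in the study.

```python
from itertools import product

# ASSUMED factor levels (hypothetical labels; see Table 2 for the real ones).
pm_levels = ("mean", "lower_limit")   # per-machine maintenance setting
operator_counts = (1, 2)              # operators available
skills = ("expert", "normal")         # operator skill

# Four machine factors, then PM operators, CM operators and skill.
factors = [pm_levels] * 4 + [operator_counts, operator_counts, skills]

scenarios = [
    {
        "pm_settings": combo[:4],     # one setting per machine
        "pm_operators": combo[4],
        "cm_operators": combo[5],
        "skill": combo[6],
    }
    for combo in product(*factors)
]

print(len(scenarios))
print(scenarios[0])
```

Each dictionary can then be passed to one simulation run, and the nine outputs of the run become one row of the decision matrix.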

Table 2 Levels of variables for different scenarios
Table 3 Results of computer simulation
Table 4 The final matrix of AHP method
Table 5 Normalized weighted matrix (V)
Table 6 Distance between the ideal and each scenario
Table 7 Distance between the non-ideal and each scenario

The results of the simulation are shown in Table 3. The outputs in the table are as follows:

  1. Number of random machine failures (unanticipated failure of system represents the system reliability).

  2. Number of PM activities.

  3. Number of human errors (PM).

  4. Average waiting time of machines (for service).

  5. Percentage of machine usage (the greater percentage represents more smooth production and more accurate planning system).

  6. Percentage of utilizing corrective maintenance operators (the time they do not wait to service).

  7. Percentage of utilizing preventive maintenance operator (the time they do not wait to service).

  8. Human cost (PM).

  9. Average cycle time (products).

Results and Discussion

To rank the weighted scenarios, the integrated AHP and TOPSIS approach is used. The final matrix of judgements (Table 4) is obtained by collecting experts’ opinions, developing pairwise comparison matrices and taking the geometric mean of each element across the matrices.

This final AHP matrix is inconsistent; therefore, the approximate row-sum method is used to obtain the weights (vector W): the sum of each row is calculated and then normalized. First, the sum of each row is calculated. For instance, the sum of the row related to C2 is equal to 28.28. Then, the greatest of these sums is determined; it is associated with C1 (57.955). Finally, the weight of each criterion is obtained by dividing the sum of its row by 57.955. For example, w2 is equal to 28.28 divided by 57.955, which is approximately 0.487. The vector of weights is as follows:

$$\begin{aligned} W= & {} (w_{1},w_{2},w_{3},w_{4},w_{5},w_{6},w_{7},w_{8},w_{9})\\= & {} (1,0.487,0.554,0.112,0.296,0.550,\\&0.074,0.226,0.499) \end{aligned}$$
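The aggregation of expert judgements and the row-sum weighting described above can be sketched as follows. The function names `aggregate_experts` and `row_sum_weights` and the two 3 by 3 comparison matrices are hypothetical; the study uses nine criteria and the matrix of Table 4. As in the text, the row sums are divided by the largest row sum, so the most important criterion receives weight 1.

```python
import math


def aggregate_experts(matrices):
    """Element-wise geometric mean of several pairwise comparison matrices."""
    k = len(matrices)
    n = len(matrices[0])
    return [
        [math.prod(mat[i][j] for mat in matrices) ** (1.0 / k) for j in range(n)]
        for i in range(n)
    ]


def row_sum_weights(matrix):
    """Approximate weights: each row sum divided by the largest row sum."""
    row_sums = [sum(row) for row in matrix]
    largest = max(row_sums)
    return [s / largest for s in row_sums]


# Hypothetical judgements of two experts on three criteria (1-9 scale).
expert_1 = [[1, 3, 5], [1 / 3, 1, 2], [1 / 5, 1 / 2, 1]]
expert_2 = [[1, 2, 4], [1 / 2, 1, 3], [1 / 4, 1 / 3, 1]]

final_matrix = aggregate_experts([expert_1, expert_2])
weights = row_sum_weights(final_matrix)
print([round(w, 3) for w in weights])
```

Because the TOPSIS closeness index is a ratio of distances, multiplying all weights by the same constant does not change the ranking, so normalizing by the largest row sum rather than by the total is harmless for the ranking step.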

Then, the matrix V, obtained by multiplying the normalized matrix by the weight vector, is shown in Table 5.

The distances of each scenario from the ideal and the non-ideal solutions are shown in Tables 6 and 7, respectively.

As mentioned earlier, the method used in this study consists of simulation, AHP and TOPSIS, and assigns different weights to the criteria based on expert judgments. The numbers of random machine failures and operator errors are used to model reliability in this study. The number of preventive maintenance activities must be optimized to reduce downtime. This study also optimizes operator utilization, cost and the average time that entities (products) spend in the system. With respect to these parameters, the best scenarios are identified. The optimum scenarios are 96, 82 and 90, as shown in Table 8; they lead to the best possible maintenance planning. In these scenarios, the normal operator, who incurs a lower cost, was chosen. Probably, a number of five scheduled preventive maintenance activities is the most appropriate, and the average time that entities spend in the system is considerably low, which indicates a higher production rate than in the other scenarios.

Table 8 Scenarios ranking based on AHP-TOPSIS approach

Conclusion

In this study, a unique and integrated approach was proposed for planning maintenance activities in an actual manufacturing system. Maintenance planning problems have been discussed from various aspects and solved by various methods. However, the integration of computer simulation with AHP-TOPSIS for maintenance planning problems incorporating human error and learning effects has not been considered in recent studies. Hence, incorporating human error and learning effects alongside reliability and availability indicators makes the integrated approach of this study unique among approaches to maintenance planning problems. The optimum maintenance plan is obtained from the best scenarios identified by AHP and TOPSIS. Assuming the same weight for all criteria is usually far from reality and is not suitable for the decision-making process. Thus, AHP-TOPSIS is used to weight the criteria based on experts’ judgements. The results of this study indicated that, in the best scenarios, there is no failure and the operators are normal skilled workers. A number of five scheduled preventive maintenance activities seems to be appropriate, and the average wait time is reasonably low. An advantage of the proposed method is the ability to add and change various parameters for sensitivity analysis: changing the input of a parameter can change the result for each output. The results of this study are helpful for managers and other decision makers in selecting the best policy for maintenance planning.