Introduction

In the present era of modernization, machines have become part and parcel of our day-to-day life. Owing to the automation of many systems, we depend on machines so heavily that life without them is difficult to imagine. Unfortunately, machines cannot be relied upon completely because they are always prone to failure. Machine failures affect the system adversely by reducing its efficiency and thereby increasing its overall cost. It is therefore a difficult task for the system developer to design a completely reliable machining system that can operate without hindrance in spite of component failures. Providing spare machines is one of the key approaches to cope with the failure of operating machines and to carry out the machining operation smoothly without interruption. Multi-component systems with provision for redundancy and maintainability are commonly seen in industrial scenarios such as production systems, computer networks and transportation systems.

Spare machines are machines that replace failed machines in the main system and then work just like the main operating machines, so that the system's functions are carried out properly and continuously. In the present paper, we study a Markovian machine repair problem with mixed standbys under the care of a repair facility having one permanent repairman and one additional removable repairman, which turn on according to the N-policy and a threshold policy, respectively. Repairs to the failed machines are rendered on a time sharing basis. Under the time-shared policy, the repair capacity is used simultaneously by all failed machines: the unit repair time is sliced among all failed machines according to the round-robin discipline. Each failed machine receives time slices from the permanent as well as the additional removable repairman and recovers after receiving several quanta of repair time, depending on the severity of the fault or damage. Taylor and Jackson (1954) introduced standby provisioning by incorporating cold standbys in a machine repair system. Since then, a great deal of research has been done on repairable machining systems with standbys (cf. Albright 1980; Wang and Sivazlian 1989; Wang 1995; Jain and Baghel 2001). Haque and Armstrong (2007), Jain et al. (2010) and Jain and Gupta (2011) reviewed the machine interference problem, also called the machine repair problem (MRP), with spares in their review articles. In recent years, many researchers have also contributed to the performance prediction of machine repair problems with standbys by incorporating other distinct features such as vacation, heterogeneous repairmen and N-policy (see Ke et al. 2009; Maheshwari et al. 2010; Jain et al. 2012). By including the F-policy, Kumar and Jain (2013a) developed a queueing model for the performance prediction of machine repair problems with standbys. A machine repair problem with standbys, an unreliable repairman and working vacation for the repairman was investigated by Jain (2014).

In many machine repair systems, the service rendered by the permanent repairman becomes too expensive because the repairman may be idle most of the time. To reduce the cost of such a system, the repairman can initiate service only when a certain workload has built up. Yadin and Naor (1963) introduced the N-policy, which can be used to control the service rendered by the repairman in an optimal manner. An optimal N-policy ensures that the repairman is activated only when N failed machines have accumulated in the system, which helps to reduce expenditures such as the set-up cost incurred to initiate a busy period after each idle period. Under the N-policy, no service is provided until the queue length has built up to N; once service has started, the repairman stops only when the queue becomes empty. Some important research works on the MRP operating under N-policy are available in the literature (cf. Shawky 2000; Jain et al. 2004; Jain and Bhargava 2009). A few papers on N-policy for the MRP have been reported in the survey article by Sharma (2012). Machine repair problems with spares were investigated by Yue et al. (2012) and Kumar and Jain (2013a, b) by incorporating the N-policy. The performance modeling of multi-component machining systems under the care of an unreliable repairman operating according to the N-policy was done by Jain et al. (2014b, c). Jayachitra and Albert (2014) presented an elaborate survey of various queueing models under N-policy, which also includes the works on MRP under N-policy.

To tackle the congestion problem, providing additional repairmen may help to reduce the workload of the permanent repairman and to upgrade the repair facility. The feature of a varying number of repairmen in queueing systems has been studied by many researchers (Shawky 1997; Jain 1998; Jain and Singh 2003). Later, some work was also done on the MRP with an additional repairman (Al-seedy and Al-Ibraheem 2001; Sharma et al. 2005; Jain et al. 2007a, b). In recent years, remarkable work has been done on deploying additional repairmen in machine repair systems according to the workload level; the important Markovian studies along this line from the last few years are reviewed here. Markovian machine repair problems with an additional repairman were investigated by Huang et al. (2011) and Liou et al. (2013) by including threshold control policies. Maheshwari and Ali (2013) studied an M/M/C/K/N machine repair problem with an additional repairman by incorporating the concept of discouragement. An MRP with mixed standbys and provision for permanent as well as additional repairmen was also investigated by Jain and Preeti (2014), who evaluated the transient-state probabilities of the system and used them to predict the system's performance characteristics.

The time sharing concept emerged in the 1960s to make computer systems an object of public utility, i.e., to make them useful for more and more people (cf. Kleinrock 1967). In the context of a machine repair system, a time-shared system is a repair facility in which the technical staff repairs the failed machines on a time slicing basis, i.e., by dividing its unit time among all the failed machines currently waiting in the queue. The repairman attends a failed machine for a pre-specified fixed quantum of time only and then moves to the next machine in the queue. If the repair of the first machine is not completed within that interval, the machine is put at the end of the queue to be served again. The repairman returns to the first machine after rendering service to each of the other failed machines in the queue for the same fixed small duration. The whole cycle continues in this way, following the round-robin discipline. As soon as the repair of an individual failed machine is completed during the repair cycle, it is removed from the queue. Newly failed machines join the queue at the last position. The fraction of the total service time offered to any failed machine depends on the number of failed machines waiting in the queue at the repair facility. In a specific case, we can assume that the sharing factor of the servers' time follows a harmonic variation of individual capacity with the number of users present in the system (cf. Kleinrock 1967; Coffman and Kleinrock 1968; Adiri and Avi-Itzhak 1969). For an early notable contribution on time-shared systems, we refer to Kleinrock (1967). Other important past contributions on time-shared computer systems are due to Rasch (1970), Yashkov (1992) and Wang and Tai (2000). Yashkov and Yashkova (2007) presented a survey article on processor-shared queueing systems, which gives an overview of the work done on the topic. In recent years, related research on time-shared queueing systems incorporating various distinct features has been done by Zhean and Knessl (2009), Altman et al. (2010), Tahar and Jean-Marie (2012), and others.
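The round-robin repair cycle described above can be illustrated with a minimal deterministic sketch. It is not part of the analytical model studied in this paper: the function name `round_robin`, the machine labels and the fixed repair requirements are hypothetical, and a single repairman with a constant quantum is assumed.

```python
from collections import deque


def round_robin(repair_needs, quantum):
    """Serve failed machines in fixed time slices.

    repair_needs: dict mapping a (hypothetical) machine label to the total
                  repair time it needs; insertion order is the queue order.
    quantum:      length of one time slice (must be positive).
    Returns a list of (machine, completion_time) pairs in completion order.
    """
    queue = deque(repair_needs.items())
    clock = 0
    finished = []
    while queue:
        machine, remaining = queue.popleft()
        served = min(quantum, remaining)
        clock += served
        remaining -= served
        if remaining > 0:
            queue.append((machine, remaining))  # not done: rejoin at the end
        else:
            finished.append((machine, clock))   # done: leave the queue
    return finished


order = round_robin({"A": 3, "B": 1, "C": 2}, quantum=1)
print(order)
```

In this toy run, the machine with the smallest repair requirement leaves first even though it was not first in the queue, which is the characteristic effect of time slicing.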

Sometimes, because few machines have failed, the repair workload is low and the repairman remains idle most of the time, which wastes resources and time. To avoid this situation and to utilize the repair facility optimally, an additional removable repairman is a better option, and it can be employed in time sharing machine repair systems as well. In a time sharing system, all the failed machines are served by the repairman at the same time through various repair positions. Such a scenario can be seen in the automobile repair shop of a travel agency, where a limited number of registered vehicles are repaired and the permanent repairman starts concurrent repair jobs on the vehicles, by slicing unit time, only when some vehicles have joined the repair shop. Under high workload, when a certain number of failed vehicles have already joined the repair shop, the secondary repairman is called upon to render repair. Many research works on time-shared systems have appeared in the queueing literature (cf. Jain and Lata 1995; Jain et al. 2005; Kim and Kim 2007). In recent years, Chandrasekaran et al. (2013) and Jeong et al. (2014) investigated the optimization issues of machining systems used for cloud computing. More recently, time-shared machining systems have been studied by Flapper et al. (2014) in a manufacturing–remanufacturing system. Jain et al. (2014a) analyzed the sensitivity of a machine repair problem with two types of spares and controlled rates by incorporating the concept of time sharing.

In the present investigation, a time-shared machine repair problem with mixed (cold and warm) standbys is studied. There is provision for a permanent as well as one additional repairman; the permanent repairman follows the N-policy, whereas the additional removable repairman is introduced when the workload of failed machines crosses a certain threshold level. The novel feature of the present model over existing models lies in combining many key realistic factors, namely the N-policy, the time sharing concept, mixed standbys and an additional removable server for heavy workload, in the performance modeling of a machine repair system. It is worth mentioning that the permanent repairman follows the N-policy, i.e., starts working only when N failed machines have accumulated in the system. The secondary additional repairman is called upon as and when the workload of failed machines crosses a critical threshold level. Both repairmen work on a time sharing basis, which means that both of them repair the failed machines present in the queue by sharing their time among all failed machines accumulated in the system. By constructing Chapman–Kolmogorov equations, the steady-state probabilities are evaluated using the recursive solution approach. The rest of the paper is organized as follows. The description of the model and the equations that govern it are given in “The model” and “Governing equations”, respectively. In “Queue size distribution”, the queue size distribution is obtained using the recursive method. In “Special cases”, some particular cases are deduced by setting appropriate parameter values. The queue size distribution is used to derive various performance measures and a cost function, which are explained in “Performance indices” and “Cost function”, respectively. To validate the tractability of the analytical results, a numerical simulation is provided in “Numerical analysis”. To summarize the findings and highlight the novel features of the work, concluding remarks are given in the last section, “Discussion”.

The model

Consider a time-shared machining system with mixed standby support under the care of a repair facility having a permanent and an additional removable server. The permanent repairman operates under the N-policy, whereas the additional repairman is called upon according to a threshold policy to reduce the workload of the permanent repairman. To develop the Markov model, we make the following assumptions:

  • The machining system is composed of Y cold and S warm standby machines along with M operating machines. The system operates under the (m, M) policy, i.e., the system can work with at least m (<M) machines in short mode, whereas M operating machines are required for the normal functioning of the machining system.

  • The lifetimes of the operating and standby machines follow the exponential distribution. The operating machines fail at rate λ, the failure rate of the cold standbys is zero, and the warm standbys fail at rate α.

  • After all spares have been used, the system starts to fail in a degraded fashion with failure rate \( \lambda_{\text{d}} \).

  • Two repairmen are provided in the maintenance facility for the repair of failed machines; the first is appointed on a permanent basis and the second is a secondary removable repairman who can be called upon to reduce the burden of the loaded permanent repairman. The repair times of the permanent and additional repairmen are exponentially distributed with rates µ and \( \mu_{\text{a}} \), respectively. The permanent and additional repairmen turn on according to the N-policy and a threshold policy, respectively, and both work on a time sharing basis. Under the N-policy, the permanent repairman starts repair only when N failed machines have accumulated in the system. Once the permanent repairman initiates repair, it continues in a time sharing manner until all the failed machines are repaired.

  • The additional removable repairman is activated when all spare machines are exhausted, so that the failure of the next machine would put the system in the short mode. Thus, to prevent the system from working in degraded mode, the additional server is called upon at the threshold level \( N_1 = Y + S \). It is deactivated as soon as the workload of failed machines drops below \( N_1 \).

  • The failed machines join the repair queue following the FCFS rule, i.e., they are queued up in the order in which they failed and joined the system. Both repairmen provide repair on a time sharing basis: they attend each failed machine in the queue for a small interval of time, the time being shared equally by all failed machines in the queue. A machine that has been attended by the repairman rejoins the queue to be served again if its repair has not been completed; otherwise it leaves the system. The time sharing factor of both repairmen is ϕ(n), which can be taken as the reciprocal of the number n of failed machines in the queue.

  • When a machine fails, the switchover time of a standby machine (if available) from the standby state to the operating state is considered to be instantaneous. Cold standbys are used to replace failed machines before warm standbys (cf. Gross et al. 2009; Jain et al. 2012; Maheshwari and Ali 2013).
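The two control rules in the assumptions above can be restated as a small sketch. The function names `permanent_active` and `additional_active` are hypothetical, and the sketch only encodes the switching logic (N-policy with hysteresis for the permanent repairman, a simple threshold at Y + S for the additional one); it says nothing about the stochastic behaviour of the system.

```python
def permanent_active(was_active, n_failed, N):
    """N-policy: switch on when N failed machines have accumulated,
    switch off only when no failed machine is left."""
    if n_failed == 0:
        return False
    if n_failed >= N:
        return True
    return was_active  # between 1 and N - 1: keep the previous status


def additional_active(n_failed, Y, S):
    """Threshold policy: the additional repairman works while the number of
    failed machines is at least N_1 = Y + S."""
    return n_failed >= Y + S


# Illustrative trace with N = 3 (values are assumptions, not from the paper).
active = False
trace = []
for n in [0, 1, 2, 3, 2, 1, 0, 1]:
    active = permanent_active(active, n, N=3)
    trace.append(active)
print(trace)
```

The trace shows the hysteresis of the N-policy: with two failed machines the permanent repairman is idle on the way up but busy on the way down.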

Let \( \lambda_n \) and \( \mu_n \) denote the down and up transition rates corresponding to the exponentially distributed life and repair processes of the machines, respectively; here, the suffix ‘n’ denotes the number of failed machines in the system. The state transition diagram, showing the in-flows and out-flows of the system states, is depicted in Fig. 1. The state-dependent failure and repair rates are defined as follows:

$$ \lambda_n = \left\{ \begin{aligned} & M\lambda + S\alpha ;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;0 \le n \le Y \\ & M\lambda + ({Y + S - n})\alpha ; \;\;\;\;\;Y < n \le Y + S \\ & ({M + Y + S - n})\lambda_d ; \;\;\;\;\;Y + S < n \le L = M + Y + S - m + 1 \\ \end{aligned} \right. $$

and

$$ \mu_n = \left\{ \begin{aligned} & \mu \phi (n); \;\;\;\quad\qquad 1 \le n < Y + S \\ & ({\mu + \mu_{\text{a}} })\phi (n);\;\;\;\;Y + S \le n \le L = M + Y + S - m + 1 \\ \end{aligned} \right. $$

Fig. 1 State transition diagram
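The state-dependent rates defined above translate directly into code. The following is a minimal sketch; the function names `failure_rate` and `repair_rate` and the default choice ϕ(n) = 1/n are assumptions made for illustration, and the parameter values in the usage lines are arbitrary.

```python
def failure_rate(n, M, Y, S, lam, alpha, lam_d):
    """Failure rate lambda_n when n machines have failed."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n <= Y:                       # only cold standbys used so far
        return M * lam + S * alpha
    if n <= Y + S:                   # warm standbys are being used up
        return M * lam + (Y + S - n) * alpha
    return (M + Y + S - n) * lam_d   # degraded (short) mode


def repair_rate(n, Y, S, mu, mu_a, phi=lambda n: 1.0 / n):
    """Repair rate mu_n when n >= 1 machines have failed."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n < Y + S:                    # permanent repairman only
        return mu * phi(n)
    return (mu + mu_a) * phi(n)      # both repairmen share their time


# Illustrative values (assumptions): M = 3, Y = 2, S = 2.
for n in range(0, 6):
    print(n, failure_rate(n, 3, 2, 2, lam=0.5, alpha=0.2, lam_d=0.7))
```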

We denote the steady-state probabilities of the system states with ‘n’ failed machines in the system as follows:

\( P_{0,n} \) :

The steady-state probability that there are n failed machines in the system and the system is in the accumulation state, i.e., the permanent repairman has not yet started repair.

\( P_{1,n} \) :

The steady-state probability that the first or both repairmen are active and there are n failed machines in the system.

\( P_{Y + S} (1) \) :

Probability that first permanent repairman is performing the repair work of the failed machine at the threshold level Y + S when all standby machines are used.

\( P_{Y + S} (2) \) :

Probability that second additional repairman is performing the repair work of the failed machine at the threshold level Y + S when all standby machines are used.

Governing equations

In this section, Chapman–Kolmogorov equations for all the states of the system using the appropriate transition rates for three different situations \( (N = Y, N < Y {\text{and}}\,\, N > Y) \) have been constructed.

Case I: The first repairman starts repair when all cold standby machines (Y) are exhausted, i.e. when N = Y

$$ - ({M\lambda + S\alpha })P_{0,0} + \mu \phi (1)P_{1,1} = 0 $$
(1)
$$ \begin{aligned} - ({M\lambda + S\alpha })P_{0,n} + ({M\lambda + S\alpha })P_{0,n - 1} = 0; \hfill \\ 1 \le n \le N - 1 \hfill \\ \end{aligned} $$
(2)
$$ - [{M\lambda + S\alpha + \mu \phi (1)}]P_{1,1} + \mu \phi (2)P_{1,2} = 0 $$
(3)
$$ \begin{aligned} - [{M\lambda + S\alpha + \mu \phi (n)}]P_{1,n} + ({M\lambda + S\alpha })P_{1,n - 1} \hfill \\ + \mu \phi ({n + 1})P_{1,n + 1} = 0;\;\;\;2 \le n \le N - 1 \hfill \\ \end{aligned} $$
(4)
$$ \begin{aligned} - [{M\lambda + S\alpha + \mu \phi (N)}]P_{1,N} + ({M\lambda + S\alpha })P_{1,N - 1} \hfill \\ + \mu \phi ({N + 1})P_{1,N + 1 } + ({M\lambda + S\alpha })P_{0,N - 1} = 0 \hfill \\ \end{aligned} $$
(5)
$$ \begin{aligned} - [{M\lambda + ({Y + S - n})\alpha + \mu \phi (n)}]P_{1,n} \hfill \\ + [{M\lambda + ({Y + S - n + 1})\alpha }]P_{1,n - 1} \hfill \\ + \mu \phi ({n + 1})P_{1,n + 1} = 0; \hfill \\ N + 1 \le n \le Y + S - 2 \hfill \\ \end{aligned} $$
(6)
$$ \begin{aligned} - [{M\lambda + \alpha + \mu \phi ({Y + S - 1})}]P_{1,({Y + S - 1})} \hfill \\ + ({M\lambda + 2\alpha })P_{1,Y + S - 2} + \mu \phi ({Y + S})P_{Y + S} (1) \hfill \\ + \mu_{\text{a}} \phi ({Y + S})P_{Y + S} (2) = 0 \hfill \\ \end{aligned} $$
(7)
$$ \begin{aligned} - [{M\lambda + \mu \phi ({Y + S})}]P_{({Y + S})} (1) \hfill \\ + ({M\lambda + \alpha })P_{1,Y + S - 1} \hfill \\ + \mu_{\text{a}} \phi ({Y + S + 1})P_{1,Y + S + 1} = 0 \hfill \\ \end{aligned} $$
(8)
$$ \begin{aligned} - [{M\lambda + \mu_{\text{a}} \phi ({Y + S})}]P_{Y + S} (2) \hfill \\ + \mu \phi ({Y + S + 1})P_{1,Y + S + 1} = 0 \hfill \\ \end{aligned} $$
(9)
$$ \begin{aligned} - [{({M - 1})\lambda_{\text{d}} + ({\mu + \mu_{\text{a}} })\phi ({Y + S + 1})}]P_{1,Y + S + 1} \hfill \\ + M\lambda P_{Y + S} (1) + M\lambda P_{Y + S} (2) \hfill \\ + ({ \mu + \mu_{\text{a}} })\phi ({Y + S + 2})P_{1,Y + S + 2} = 0 \hfill \\ \end{aligned} $$
(10)
$$ \begin{aligned} - [{({M + Y + S - n})\lambda_{\text{d}} + ({\mu + \mu_{\text{a}} })\phi (n)}]P_{1,n} \hfill \\ + ({M + Y + S - n + 1})\lambda_{\text{d}} P_{1,n - 1} \hfill \\ + ({ \mu + \mu_{\text{a}} })\phi ({n + 1})P_{1,n + 1} = 0; \hfill \\ Y + S + 2 \le n < L \hfill \\ \end{aligned} $$
(11)
$$ - ({ \mu + \mu_{\text{a}} })\phi (L)P_{1,L} + m\lambda_{\text{d}} P_{1,L - 1} = 0 $$
(12)

Case II: The threshold value (N) at which the repair starts is less than the number of cold standby machines (Y), i.e. when N < Y

The steady-state probabilities of the states (0, n) for 0 ≤ n ≤ N − 1 are governed by Eqs. (1)–(2), and those of the states (1, n) for 1 ≤ n ≤ N by Eqs. (3)–(5). We now construct the equations for the states (1, N + 1) to (1, Y + S − 2) as follows:

$$ \begin{aligned} - [{M\lambda + S\alpha + \mu \phi (n)}]P_{1,n} + ({M\lambda + S\alpha })P_{1,n - 1} \hfill \\ + \mu \phi ({n + 1})P_{1,n + 1} = 0; N + 1 \le n \le Y \hfill \\ \end{aligned} $$
(13)
$$ \begin{aligned} - [{M\lambda + ({S - 1})\alpha + \mu \phi ({Y + 1})}]P_{1,Y + 1} \hfill \\ + ({M\lambda + S\alpha })P_{1,Y} + \mu \phi ({Y + 2})P_{1,Y + 2} = 0 \hfill \\ \end{aligned} $$
(14)
$$ \begin{aligned} - [{M\lambda + ({Y + S - n})\alpha + \mu \phi (n)}]P_{1,n} \hfill \\ + [{M\lambda + ({Y + S - n + 1})\alpha }]P_{1,n - 1} \hfill \\ + \mu \phi ({n + 1})P_{1,n + 1} = 0; \hfill \\ Y + 2 \le n \le Y + S - 2 \hfill \\ \end{aligned} $$
(15)

For the range (1, Y + S − 1) to (1, L), Eqs. (7)–(12) hold.

Case III: The threshold parameter (N) is more than the number of cold standby machines (Y) and less than the total number (Y + S) of standby machines, i.e. when Y < N < Y + S

In this case, Eqs. (1) and (3) hold for the states (0, 0) and (1, 1), respectively. For the states (0, 1) to (0, N − 1), we have the following equations:

$$ - ({M\lambda + S\alpha })P_{0,n} + ({M\lambda + S\alpha })P_{0,n - 1} = 0; 1 \le n \le Y $$
(16)
$$ - [{M\lambda + ({S - 1})\alpha }]P_{0,Y + 1} + ({M\lambda + S\alpha })P_{0,Y} = 0 $$
(17)
$$ \begin{aligned} - [{M\lambda + ({Y + S - n})\alpha }]P_{0,n} \hfill \\ + [{M\lambda + ({Y + S - n + 1})\alpha }]P_{0,n - 1} = 0; \hfill \\ Y + 2 \le n \le N - 1 \hfill \\ \end{aligned} $$
(18)

For the states (1, 2) to (1, N), we construct the following equations:

$$ \begin{aligned} - [{M\lambda + S\alpha + \mu \phi (n)}]P_{1,n} + ({M\lambda + S\alpha })P_{1,n - 1} \hfill \\ + \mu \phi ({n + 1})P_{1,n + 1} = 0;2 \le n \le Y \hfill \\ \end{aligned} $$
(19)
$$ \begin{aligned} - [{M\lambda + ({S - 1})\alpha + \mu \phi ({Y + 1})}]P_{1,Y + 1} \hfill \\ + ({M\lambda + S\alpha })P_{1,Y} + \mu \phi ({Y + 2})P_{1,Y + 2} = 0 \hfill \\ \end{aligned} $$
(20)
$$ \begin{aligned} {-[{M\lambda + ({Y + S - n})\alpha + \mu \phi (n)}]}P_{1,n} \hfill \\ + [{M\lambda + ({Y + S - n + 1})\alpha }]P_{1,n - 1} \hfill \\ + \mu \phi ({n + 1})P_{1,n + 1} = 0; \hfill \\ Y + 2 \le n \le N - 1 \hfill \\ \end{aligned} $$
(21)
$$ \begin{aligned} {-[{M\lambda + ({Y + S - N})\alpha + \mu \phi (N)}]}P_{1,N} \hfill \\ + [{M\lambda + ({Y + S - N + 1})\alpha }]P_{1,N - 1} \hfill \\ + \mu \phi ({N + 1})P_{1,N + 1} \hfill \\ + [{M\lambda + ({Y + S - N + 1})\alpha }]P_{0,N - 1} = 0 \hfill \\ \end{aligned} $$
(22)

For the states in the range (1, N + 1) to (1, L), Eqs. (6)–(12) also hold.

Queue size distribution

The queue size distribution, i.e., the steady-state probabilities \( P_{0,n} \) and \( P_{1,n} \), can be obtained by solving the governing equations and can then be used to evaluate the performance measures of interest.

Case I: When N = Y

In this case, when the number of cold standbys and the threshold level N are equal, the steady-state probabilities \( P_{0,n} \), \( P_{1,n} \), \( P_{Y + S} (1) \) and \( P_{Y + S} (2) \) can be obtained by solving Eqs. (1)–(12) recursively in the following manner.

Equation (1) can be written as:

$$ P_{1,1} = \frac{\varLambda }{a_1 } P_{0,0} $$
(23)

where \( \varLambda = M\lambda + S\alpha ; \;\;\;a_n = \mu \phi (n). \)

From Eq. (2), we can get

$$ P_{0,n} = P_{0,0} ; 1 \le n \le N - 1 $$
(24)

Solving Eqs. (3) and (4), we obtain

$$ P_{1,n} = \frac{{\varLambda^n + \sum_{k = 1}^{n - 1} \varLambda^{n - k} \gamma^{(k)} }}{{\gamma^{(n)} }}P_{0,0} ;\;\;1 \le n \le N $$
(25)

where \( \gamma^{(n)} = \prod \nolimits_{j = 1}^n a_j. \)
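As a quick consistency check, the closed form (25) can be compared with the values obtained by iterating Eqs. (1), (3) and (4) directly. The sketch below does this in exact rational arithmetic with \( P_{0,0} \) set to 1; the function names and the numerical values of \( \varLambda \), µ and N are assumptions chosen only for illustration, and ϕ(n) = 1/n is used as the sharing factor.

```python
from fractions import Fraction
from math import prod


def gamma(n, mu, phi):
    """gamma^(n) = a_1 a_2 ... a_n with a_j = mu * phi(j); gamma^(0) = 1."""
    return prod(mu * phi(j) for j in range(1, n + 1))


def p1_closed(n, Lam, mu, phi):
    """Closed form of Eq. (25) with P_{0,0} = 1."""
    num = Lam**n + sum(Lam**(n - k) * gamma(k, mu, phi) for k in range(1, n))
    return num / gamma(n, mu, phi)


def p1_recursive(N, Lam, mu, phi):
    """Iterate Eq. (1), Eq. (3) and Eq. (4) with P_{0,0} = 1."""
    a = lambda n: mu * phi(n)
    p = {1: Lam / a(1)}                                   # Eq. (1)
    if N >= 2:
        p[2] = (Lam + a(1)) * p[1] / a(2)                 # Eq. (3)
    for n in range(2, N):                                 # Eq. (4), 2 <= n <= N - 1
        p[n + 1] = ((Lam + a(n)) * p[n] - Lam * p[n - 1]) / a(n + 1)
    return p


Lam, mu, N = Fraction(19, 10), Fraction(3), 5
phi = lambda n: Fraction(1, n)
rec = p1_recursive(N, Lam, mu, phi)
agree = all(rec[n] == p1_closed(n, Lam, mu, phi) for n in range(1, N + 1))
print(agree)
```

Exact fractions are used so that agreement can be tested with equality rather than with a floating-point tolerance.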

On solving Eq. (5) for n = N, we obtain

$$ P_{1,N + 1} = \frac{B_1 }{{\gamma^{(N + 1)} }}P_{0,0} $$
(26)

where \( B_1 = \varLambda^{Y + 1} + \sum_{k = 1}^{Y - 1} {\varLambda^{Y + 1 - k} \gamma^{(k)} } \) (recall that N = Y in this case).

Again, solving recursively for n = N + 1 to Y + S − 2, we obtain

$$ P_{1,n} = \frac{B_1 }{{\gamma^{(n)} }}\prod \limits_{i = 1}^{n - Y - 1} ({\lambda_{Y + i} })P_{0,0} ;\;\;N + 2 \le n \le Y + S - 1 $$
(27)

where \( \lambda_n = M\lambda + ({Y + S - n})\alpha. \)

To obtain the steady-state probabilities for the nodes Y + S, Y + S + 1 and Y + S + 2, i.e., \( P_{1,Y + S} \), \( P_{1, Y + S + 1 } \), \( P_{1, Y + S + 2} \), we solve Eqs. (7), (8), (9) and (10) recursively. Solving Eqs. (7) and (8), we get

$$ P_{Y + S} (1) = \frac{B_1 }{{\gamma^{(Y + S)} }}\prod \limits_{i = 1}^{S - 1} ({\lambda_{Y + i} })P_{0,0} - \frac{{\mu_{\text{a}} }}{\mu }P_{Y + S} (2) $$
(28)
$$ \begin{aligned} P_{1,Y + S + 1} = & \frac{{B_1 \prod_{i = 1}^S ({\lambda_{Y + i} })}}{{\mu_a \phi ({Y + S + 1})\gamma^{(Y + S)} }}P_{0,0} \\& - \left[{\frac{{\lambda_{Y + S} + a_{Y + S} }}{{a_{Y + S + 1} }}}\right]P_{Y + S} (2) \\ \end{aligned} $$
(29)

Multiplying the above equation by \( \mu \phi ({Y + S + 1}) \) on both sides and then solving it simultaneously with Eq. (9), we get

$$ P_{Y + S} (2) = \frac{\mu }{{\mu_{\text{a}} }}C_1 \xi_1 P_{0,0} $$
(30)

where \( \xi_1 = \frac{{B_1 \prod_{i = 1}^S ({\lambda_{Y + i} })}}{{\gamma^{(Y + S)} }} \) and \( C_1 = \frac{1}{{2\lambda_{Y + S} + ({\mu + \mu_a })\phi (Y + S)}} \). On substituting the value of \( P_{Y + S} (2) \) in Eqs. (28) and (29), we get

$$ P_{Y + S} (1) = \frac{{\lambda_{Y + S} + ({\mu + \mu_{\text{a}} })\phi ({Y + S})}}{{\lambda_{Y + S} }}C_1 \xi_1 P_{0,0} $$
(31)

and

$$ P_{1, Y + S + 1} = \frac{{\lambda_{Y + S} + \mu_{\text{a}} \phi ({Y + S})}}{{\mu_{\text{a}} \phi ({Y + S + 1})}}C_1 \xi_1 P_{0,0} $$
(32)
$$ P_{1,Y + S} = P_{Y + S} (1) + P_{Y + S} (2) $$
(33)
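The closed forms (30)–(32) can be checked by substituting them back into the balance equations (7)–(9). The sketch below does so in exact rational arithmetic with \( P_{0,0} = 1 \); all parameter values are assumptions chosen for illustration (with S ≥ 3 so that Eq. (7) only involves states covered by Eqs. (26)–(27)), and ϕ(n) = 1/n is used as the sharing factor.

```python
from fractions import Fraction as Fr
from math import prod

# Illustrative parameters (assumptions): Case I, so N = Y.
M, Y, S = 3, 2, 3
lam, alpha = Fr(1, 2), Fr(1, 5)
mu, mu_a = Fr(3), Fr(2)
T = Y + S                                   # threshold level Y + S


def phi(n):
    return Fr(1, n)


def a(n):
    return mu * phi(n)


def gamma(n):
    return prod(a(j) for j in range(1, n + 1))


def lam_n(n):
    """lambda_n = M*lambda + (Y + S - n)*alpha, used for Y <= n <= Y + S."""
    return M * lam + (Y + S - n) * alpha


Lam = M * lam + S * alpha
B1 = Lam**(Y + 1) + sum(Lam**(Y + 1 - k) * gamma(k) for k in range(1, Y))


def p1(n):
    """Eqs. (26)-(27): P_{1,n} for Y + 1 <= n <= Y + S - 1."""
    return B1 * prod(lam_n(Y + i) for i in range(1, n - Y)) / gamma(n)


xi1 = B1 * prod(lam_n(Y + i) for i in range(1, S + 1)) / gamma(T)
C1 = 1 / (2 * lam_n(T) + (mu + mu_a) * phi(T))
p_T2 = mu / mu_a * C1 * xi1                                           # Eq. (30)
p_T1 = (lam_n(T) + (mu + mu_a) * phi(T)) / lam_n(T) * C1 * xi1        # Eq. (31)
p_next = (lam_n(T) + mu_a * phi(T)) / (mu_a * phi(T + 1)) * C1 * xi1  # Eq. (32)

# Left-hand sides of Eqs. (7), (8) and (9); each should vanish.
res7 = (-(lam_n(T - 1) + a(T - 1)) * p1(T - 1) + lam_n(T - 2) * p1(T - 2)
        + a(T) * p_T1 + mu_a * phi(T) * p_T2)
res8 = (-(lam_n(T) + a(T)) * p_T1 + lam_n(T - 1) * p1(T - 1)
        + mu_a * phi(T + 1) * p_next)
res9 = -(lam_n(T) + mu_a * phi(T)) * p_T2 + mu * phi(T + 1) * p_next
print(res7, res8, res9)
```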

From Eqs. (10), (11) and (12), we can obtain the results for the remaining states as:

$$ \begin{aligned} P_{1,n} = {} & \frac{{\lambda_{Y + S} + \mu_{\text{a}} \phi ({Y + S})}}{{\mu_{\text{a}} \phi ({Y + S + 1})}} \\ & \times \prod \limits_{i = 1}^{n - ({Y + S}) - 1} \left[{\frac{{\lambda_{Y + S + i}^{\prime} }}{{({\mu + \mu_{\text{a}} })\phi ({Y + S + i + 1})}}}\right] \times C_1 \xi_1 P_{0,0} ; \\ & Y + S + 2 \le n \le L \\ \end{aligned} $$
(34)

where \( \lambda_n^{\prime}= ({M + Y + S - n})\lambda_d. \)

Thus, the queue size distribution for Case I is given by Eqs. (24)–(27) and (30)–(34). To obtain the probability \( P_{0,0} \), the following normalizing condition is used:

$$ \sum \limits_{i = 0}^{N - 1} P_{0,i} + \sum \limits_{j = 1}^L P_{1,j} = 1 $$
(35)
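The role of the normalizing condition (35) can be illustrated numerically. The sketch below is a simplified stand-in, not the solution derived in this paper: it assumes that the two sub-states at level Y + S are lumped into a single state, so that the chain becomes an N-policy birth-death process that can be solved by balancing the flow across each level. The function name `steady_state`, the rate callables and all parameter values are assumptions for illustration.

```python
def steady_state(N, L, lam_of, mu_of):
    """Normalised probabilities (p0, p1) of a lumped N-policy chain.

    lam_of(n): failure rate with n failed machines, 0 <= n <= L - 1
    mu_of(n):  total repair rate with n failed machines, 1 <= n <= L
    """
    # Accumulation states (0, n): lam_of(n) * P_{0,n} = lam_of(0) * P_{0,0}.
    p0 = {n: lam_of(0) / lam_of(n) for n in range(N)}       # P_{0,0} = 1
    # Busy states (1, n): flow up from level n equals flow down from n + 1.
    p1 = {}
    for n in range(L):
        up_flow = lam_of(n) * (p0.get(n, 0.0) + p1.get(n, 0.0))
        p1[n + 1] = up_flow / mu_of(n + 1)
    total = sum(p0.values()) + sum(p1.values())             # Eq. (35)
    return ({n: v / total for n, v in p0.items()},
            {n: v / total for n, v in p1.items()})


# Illustrative parameters (assumptions).
M, Y, S, m, N = 3, 2, 2, 1, 2
lam, alpha, lam_d, mu, mu_a = 0.5, 0.2, 0.7, 3.0, 2.0
L = M + Y + S - m + 1


def lam_of(n):
    if n <= Y:
        return M * lam + S * alpha
    if n <= Y + S:
        return M * lam + (Y + S - n) * alpha
    return (M + Y + S - n) * lam_d


def mu_of(n):
    return (mu if n < Y + S else mu + mu_a) / n              # phi(n) = 1/n


p0, p1 = steady_state(N, L, lam_of, mu_of)
print(sum(p0.values()) + sum(p1.values()))
```

After normalisation, \( P_{0,0} \) is simply `p0[0]`; the remaining probabilities keep the ratios fixed by the recursion.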

Case II: When \( N < Y \)

In a similar manner as in Case I, on solving Eqs. (1)–(5) and (7)–(15) recursively and using the notations:

$$ B = \varLambda^n + \sum \limits_{k = 1}^{n - 2} \varLambda^{n - k} \gamma^{(k)} , \xi = \frac{{B\prod_{i = 1}^S \lambda_{Y + i} }}{{\gamma^{(Y + S)} }}, $$
$$ C = \frac{1}{{2\lambda_{Y + S} + ({\mu + \mu_{\text{a}} })\phi ({Y + S})}}, $$
$$ Z = \sum \limits_{k = 1}^{n - Y - 1} \left({\prod \limits_{i = k + 1}^{n - Y - 1} \lambda_{Y + i} }\right)\gamma^{(Y + k )}, $$
$$ P_{Y + S} (1) = \frac{{\lambda_{Y + S} + ({\mu + \mu_{\text{a}} })\phi ({Y + S})}}{{\lambda_{Y + S} }} \times C\xi P_{0,0} , $$
$$\text{and}\; P_{Y + S} (2) = \frac{\mu }{{\mu_{\text{a}} }}C\xi P_{0,0} $$

with all other notations the same as in Case I, we obtain the queue size distribution as follows:

$$ P_{0,n} = P_{0,0} ; 1 \le n \le N - 1 $$
(36)
$$ P_{1,n} = \left\{ \begin{aligned} \left({\frac{{\varLambda^n + \sum_{k = 1}^{n - 1} [\varLambda^{n - k} \gamma^{(k)} ]}}{{\gamma^{(n)} }}}\right)P_{0,0} ; 1 \le n \le N \hfill \\ \left({\frac{{\varLambda^n + \sum_{k = 1}^{n - 2} [{\varLambda^{n - k} \gamma^{(k)} }]}}{{\gamma^{(n)} }}}\right)P_{0,0} ;\;\;N + 1 \le n \le Y + 1 \hfill \\ \frac{{B\prod_{i = 1}^{n - Y - 1} ({\lambda_{Y + i} }) + \frac{\varLambda^2 }{a_Y }Z}}{{\gamma^{(n)} }} P_{0,0} ;\;\;Y + 2 \le n \le Y + S - 1 \hfill \\ P_{Y + S} (1) + P_{Y + S} (2); n = Y + S \hfill \\ \frac{{\lambda_{Y + S} + \mu_{\text{a}} \phi ({Y + S})}}{{\mu_{\text{a}} \phi ({Y + S + 1})}}C\xi P_{0,0} ;\;\;n = Y + S + 1 \hfill \\ \frac{{\lambda_{Y + S} + \mu_{\text{a}} \phi ({Y + S})}}{{\mu_{\text{a}} \phi ({Y + S + 1})}} \hfill \\ \times \prod \limits_{i = 1}^{n - ({Y + S}) - 1} \left[{\frac{{\lambda_{Y + S + i}^{\prime} }}{{({\mu + \mu_{\text{a}} })\phi ({Y + S + i + 1}) }}}\right] \hfill \\ \times C\xi P_{0,0} ; \hfill \\ Y + S + 2 \le n \le L \hfill \\ \end{aligned} \right. $$
(37)

Case III: When \( N > Y \)

Using the following notations,

$$ B_2 = \varLambda^{Y + 1} + \sum \limits_{k = 1}^Y ({\varLambda^{Y - k + 1} \gamma^{(k)} }), $$
$$ \xi_2 = \frac{{B_2 \prod_{i = 1}^{Y + S - N} \lambda_{N + i} }}{{\gamma^{(Y + S)} }}, $$
$$ C_2 = \frac{1}{{2\lambda_{Y + S} + ({\mu + \mu_{\text{a}} })\phi ({Y + S})}} , $$
$$ Z_1 = \sum \limits_{k = 1}^{n - Y - 1} \left({\prod \limits_{i = k + 1}^{n - Y - 1} \lambda_{Y + i} }\right)\gamma^{(Y + k )} , $$

we have

$$ P_{Y + S} (1) = \frac{{\lambda_{Y + S} + ({\mu + \mu_{\text{a}} })\phi ({Y + S})}}{{\lambda_{Y + S} }} \times C_2 \xi_2 P_{0,0 } $$

and

$$ P_{Y + S} (2) = \frac{\mu }{{\mu_{\text{a}} }}C_2 \xi_2 P_{0,0} $$

From Eqs. (1), (3), (6)–(12) and (16)–(22), we derive the queue size as follows:

$$ P_{0,n} = \left\{ {\begin{array}{*{20}c} {P_{0,0} ; 1 \le n \le Y} \\ {\frac{\varLambda }{{\lambda_n }}P_{0,0} ; \;\;Y + 1 \le n \le N - 1 } \\ \end{array} } \right. $$
(38)
$$ P_{1,n} = \left\{ \begin{aligned} \left({\frac{{\varLambda^n + \sum_{k = 1}^{n - 1} ({\varLambda^{n - k} \gamma^{(k)} })}}{{\gamma^{(n)} }}}\right)P_{0,0} ;\;\;1 \le n \le Y + 1 \hfill \\ \frac{{B_2 \prod_{i = 1}^{n - Y - 1} ({\lambda_{Y + i} }) + \varLambda Z_1 }}{{\gamma^{(n)} }}P_{0,0} ;\;\;Y + 2 \le n \le N \hfill \\ \frac{B_2 }{{\gamma^{(n)} }}\prod \limits_{i = 1}^{n - Y - 1} \lambda_{Y + i} P_{0,0} ;\;\;N + 1 \le n \le Y + S - 1 \hfill \\ P_{Y + S} (1) + P_{Y + S} (2); n = Y + S \hfill \\ \frac{{\lambda_{Y + S} + \mu_{\text{a}} \phi ({Y + S})}}{{ \mu_{\text{a}} \phi ({Y + S + 1})}}C_2 \xi_2 P_{0,0} ;\;\; n = Y + S + 1 \hfill \\ \frac{{\lambda_{Y + S} + \mu_{\text{a}} \phi ({Y + S})}}{{\mu_{\text{a}} \phi ({Y + S + 1})}} \hfill \\ \times \prod \limits_{i = 1}^{n - ({Y + S}) - 1} \left[{\frac{{\lambda_{Y + S + i}^{\prime} }}{{({\mu + \mu_{\text{a}} })\phi ({Y + S + i + 1}) }}}\right] \hfill \\ \times C_2 \xi_2 P_{0,0} ;\;\; Y + S + 2 \le n \le L \hfill \\ \end{aligned} \right. $$
(39)

Special cases

To establish the validity of the results obtained in “Queue size distribution” (Case I), we explore some special cases by varying the values of \( \phi (n) \) and \( N \) as follows:

  1.

    If \( \phi (n) = \frac{1}{n} \), the model portrays a time sharing, state-dependent MRP working under N-policy. For the states (0, n), the state probabilities are the same as given by Eq. (24). For the states (1, n), using

    $$ D = \varLambda^{N + 1} + \sum_{k = 1}^{N - 1} {\frac{{({\varLambda^{N + 1 - k} \mu^k })}}{k!}} $$

    and

    $$ E = \left[{\frac{{({Y + S + 1})!}}{{\mu^{Y + S} \times \mu_{\text{a}} }}}\right]\left[{\frac{{({Y + S})\lambda_{Y + S} + \mu + \mu_{\text{a}} }}{{2({Y + S})\lambda_{Y + S} + \mu + \mu_{\text{a}} }}}\right] $$

    we have

    $$ \begin{aligned} P_{Y + S} (1) = & \left[{\frac{{({Y + S})\lambda_{Y + S} + \mu + \mu_{\text{a}} }}{{2({Y + S})\lambda_{Y + S} + \mu + \mu_{\text{a}} }}}\right] \\ \times \frac{(Y + S)!D}{{\mu^{Y + S} \times \lambda_{Y + S} }} \times \left[{\prod \limits_{i = 1}^S \lambda_{Y + i} }\right]P_{0,0} \\ \end{aligned} $$
    $$ \begin{aligned} P_{Y + S} (2) = & \frac{\mu }{{\mu_{\text{a}} }} \times \frac{{({Y + S})\lambda_{Y + S} }}{{2({Y + S})\lambda_{Y + S} + \mu + \mu_{\text{a}} }} \\ \times \frac{{({Y + S})!D}}{{\mu^{Y + S} }} \times \left[{ \prod \limits_{i = 1}^S \lambda_{Y + i} }\right]P_{0,0} \\ \end{aligned} $$

    Thus, we get the following probabilities:

    $$ P_{1,n} = \left\{ \begin{aligned} \frac{n!}{\mu^n }\left[{\varLambda^n + \sum \limits_{k = 1}^{n - 1} \frac{{({\varLambda^{n - k} \mu^k })}}{k!}}\right] P_{0,0} ; \;\;1 \le n \le N \hfill \\ \frac{{({N + 1})!}}{{\mu^{N + 1} }}[D]P_{0,0} ; \;\; n = N + 1 \hfill \\ \frac{n!D}{\mu^n }\left[{\prod \limits_{i = 1}^{n - Y - 1} \frac{{\lambda_{Y + i} }}{\mu }({Y + i})}\right]P_{0,0} ;\;\;Y + 2 \le n \le Y + S - 1 \hfill \\ P_{Y + S} (1) + P_{Y + S} (2);\;\;n = Y + S \hfill \\ \left[{\frac{{D({Y + S + 1})!}}{{\mu^{Y + S} \times \mu_{\text{a}} }}}\right] \hfill \\ \times \left[{\frac{{({Y + S})\lambda_{Y + S} + \mu_{\text{a}} }}{{2({Y + S})\lambda_{Y + S} + \mu + \mu_{\text{a}} }}}\right] \hfill \\ \times \left[{\prod \limits_{i = 1}^s \lambda_{Y + i} }\right] P_{0,0} ;\;\;n = Y + S + 1 \hfill \\ \left[{\prod \limits_{i = 1}^{n - ({Y + S}) - 1} \frac{{\lambda_{Y + S + i}^{\prime} ({Y + S + i + 1})}}{{({\mu + \mu_{\text{a}} })}}}\right] \hfill \\ \times \left[{\prod \limits_{i = 1}^s \lambda_{Y + i} }\right] \times D E \, P_{0,0} ;\;\; Y + S + 2 \le n \le L \hfill \\ \end{aligned} \right. $$
    (40)
  2.

    If \( \phi (n) = 1 \), the model reduces to a state-dependent N-policy MRP model with an additional repairman. In this case, for brevity, the following notation is used:

    $$ G = \left[{\varLambda^{N + 1} + \sum \limits_{k = 1}^{N - 1} ({\varLambda^{N + 1 - k} \mu^k })} \right], $$
    $$ H = \left[{\frac{{\lambda_{Y + S} + \mu_{\text{a}} }}{{2\lambda_{Y + S} + \mu + \mu_{\text{a}} }}}\right] $$
    $$ P_{Y + S} (1) = \left[{\frac{GH}{{\mu^{Y + S} \times \mu_{\text{a}} }}}\right] \times \left[{\prod \limits_{i = 1}^s \lambda_{Y + i} }\right]P_{0,0} $$

    and \( P_{Y + S} (2) = \frac{\mu \times G}{{\mu_{\text{a}} \times \mu^{Y + S} }} \times \frac{1}{{2\lambda_{Y + S} + ({\mu + \mu_{\text{a}} })}} \times [{\prod \limits_{i = 1}^s \lambda_{Y + i} }]P_{0,0} \)

    The steady-state probabilities \( P_{0,n} \) can be obtained by Eq. (24). The steady-state probabilities \( P_{1,n} \) become:

    $$ P_{1,n} = \left\{ \begin{aligned} \frac{1}{\mu^n }\left[{\varLambda^n + \sum \limits_{k = 1}^{n - 1} ({\varLambda^{n - k} \mu^k })}\right] P_{0,0} ;\;\; 1 \le n \le N \hfill \\ \frac{1}{{\mu^{N + 1} }}[G]P_{0,0} ;\;\; n = N + 1 \hfill \\ \frac{G}{\mu^n }\left[{\prod \limits_{i = 1}^{n - Y - 1} \lambda_{Y + i} }\right]P_{0,0} ;\;\; Y + 2 \le n \le Y + S - 1 \hfill \\ P_{Y + S} (1) + P_{Y + S} (2); \;\;n = Y + S \hfill \\ \left[{\frac{GH}{{\mu^{Y + S} \times \mu_{\text{a}} }}}\right] \times \left[{\prod \limits_{i = 1}^s \lambda_{Y + i} }\right] P_{0,0} ;\;\; n = Y + S + 1 \hfill \\ \left[{\prod \limits_{i = 1}^{n - ({Y + S}) - 1} \frac{{\lambda_{Y + S + i}^{\prime} }}{{({\mu + \mu_{\text{a}} })}}}\right] \times \left[{\prod \limits_{i = 1}^s \lambda_{Y + i} }\right] \hfill \\ \times \frac{GH}{{\mu^{Y + S} \times \mu_{\text{a}} }}P_{0,0} ;\;\; Y + S + 2 \le n \le L \hfill \\ \end{aligned} \right. $$
    (41)
  3.

    If \( \phi (n) = \frac{1}{n} \) and \( N = 1 \), the machining system reduces to a time-sharing, state-dependent queueing system. The repairman is activated as soon as a machine fails, i.e., the N-policy is not taken into account. The queue size distribution can be obtained by substituting \( N = 1 \) and \( \phi (n) = \frac{1}{n} \) in Eqs. (25)–(27), (30) and (32)–(34).

  4.

    If \( \phi (n) = 1 \) and \( N = 1 \), the model provides results for a machining system with an additional repairman but without the N-policy or the time-sharing factor.
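The constants D, G and H used in special cases 1 and 2 are simple finite sums and ratios, so they are easy to evaluate numerically. The following is a minimal sketch, assuming that \( \varLambda \), \( \mu \), \( \mu_{\text{a}} \), \( \lambda_{Y + S} \) and \( N \) are already known as plain numbers; the function names are illustrative and do not come from the paper.

```python
from math import factorial


def d_constant(Lam, mu, N):
    """D of special case 1: Lam^(N+1) + sum_{k=1}^{N-1} Lam^(N+1-k) mu^k / k!."""
    return Lam ** (N + 1) + sum(
        Lam ** (N + 1 - k) * mu ** k / factorial(k) for k in range(1, N)
    )


def g_constant(Lam, mu, N):
    """G of special case 2: same sum as D but without the 1/k! factor."""
    return Lam ** (N + 1) + sum(
        Lam ** (N + 1 - k) * mu ** k for k in range(1, N)
    )


def h_constant(lam_YS, mu, mu_a):
    """H of special case 2: (lam + mu_a) / (2 lam + mu + mu_a), lam = lambda_{Y+S}."""
    return (lam_YS + mu_a) / (2 * lam_YS + mu + mu_a)
```

For \( N = 1 \) both sums are empty, so D and G both collapse to \( \varLambda^{2} \), which matches the substitution \( N = 1 \) used in special cases 3 and 4.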

Performance indices

The performance indices help the system engineer develop an appropriate design for the machining system. Using the probabilities obtained in Case I, the following performance indices are established: the expected number of failed machines in the system, the probability that the system is in the accumulation state, the probability that only the permanent repairman is busy, the probability that both repairmen are busy, the throughput of the system and the variance of the number of failed machines in the system.

  • The expected number of failed machines in the system is

    $$ E(n) = \sum \limits_{n = 0}^{N - 1} nP_{0,n} + \sum \limits_{n = 1}^L nP_{1,n} $$
    (42)
  • The probability of the system being in accumulation state is

    $$ P(A) = \sum \limits_{n = 0}^{N - 1} P_{0,n} $$
    (43)
  • The probability that only the permanent repairman is busy is

    $$ P({\text{FB}}) = \sum \limits_{n = 0}^{Y + S - 1} P_{1,n} $$
    (44)
  • The probability that both repairmen are busy is

    $$ P({\text{BB}}) = \sum \limits_{n = Y + S}^L P_{1,n} $$
    (45)
  • The throughput of the time-shared system is

    $$ \tau = \mu \sum \limits_{n = 1}^{Y + S - 1} P_{1,n} + ({\mu + \mu_{\text{a}} }) \sum \limits_{n = Y + S}^L P_{1,n} $$
    (46)
  • The variance of the number of failed machines is

    $$ {\text{Var}}(n) = \sum \limits_{n = 0}^{N - 1} n^2 P_{0,n} + \sum \limits_{n = 1}^L n^2 P_{1,n} - ({E(n)})^2 $$
    (47)
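Once the steady-state probabilities are available, the indices in Eqs. (42)–(47) reduce to weighted sums. The following is a minimal sketch, assuming the probabilities are supplied as two Python lists: `p0[n]` holds \( P_{0,n} \) for \( n = 0, \ldots, N - 1 \) and `p1[n]` holds \( P_{1,n} \) for \( n = 0, \ldots, L \), with `p1[0]` set to 0 if that state is not used. The function name and the list layout are illustrative choices, and the variance is computed as the second moment over all states minus the squared mean, which is an assumption about how Eq. (47) is meant to read.

```python
def performance_indices(p0, p1, Y, S, mu, mu_a):
    """Evaluate the indices of Eqs. (42)-(47) from given state probabilities.

    p0[n] = P_{0,n}, n = 0..N-1 (accumulation states)
    p1[n] = P_{1,n}, n = 0..L   (repair states; p1[0] may be 0)
    """
    k = Y + S  # threshold at which the additional repairman is also busy

    # First and second moments of the number of failed machines.
    mean = sum(n * p for n, p in enumerate(p0)) + sum(
        n * p for n, p in enumerate(p1)
    )
    second = sum(n * n * p for n, p in enumerate(p0)) + sum(
        n * n * p for n, p in enumerate(p1)
    )

    return {
        "E_n": mean,                      # Eq. (42)
        "P_A": sum(p0),                   # Eq. (43)
        "P_FB": sum(p1[:k]),              # Eq. (44)
        "P_BB": sum(p1[k:]),              # Eq. (45)
        "throughput": mu * sum(p1[1:k])   # Eq. (46)
        + (mu + mu_a) * sum(p1[k:]),
        "Var_n": second - mean ** 2,      # Eq. (47), assumed form
    }
```

Since every state belongs to exactly one of the three groups, P(A) + P(FB) + P(BB) should equal 1 for a properly normalized distribution, which gives a quick sanity check on the inputs.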

Cost function

The cost function for the time-shared machine repair problem is constructed so that the system can be made economical through an optimal choice of repair rates. It is desirable to reduce the cost as much as possible by setting the optimal service rates. For the system under study, we define the cost factors associated with the main activities as follows:

\( C_f \): Cost per unit time for each failed machine present in the system

\( C_a \): Cost per unit time in the accumulation state

\( C_p \): Cost per unit time of the permanent repairman

\( C_b \): Cost per unit time of the additional removable repairman

To achieve the maximum net profit, total average cost must be minimized. The total average cost is given by

$$ \begin{aligned} E\{{\text{TC}}\} =\; & C_f E(n) + C_a P(A) + C_p P({\text{FB}}) \\ & + ({C_b + C_p })P({\text{BB}}) \\ \end{aligned} $$
(48)
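Equation (48) is a linear combination of four performance indices, so it can be evaluated directly once they are known. The following is a minimal sketch; the function name is illustrative, and the index values in the usage note are made-up numbers chosen only to show the arithmetic, not results from the paper.

```python
def expected_total_cost(E_n, P_A, P_FB, P_BB, C_f, C_a, C_p, C_b):
    """Expected total cost per unit time, following Eq. (48).

    The permanent repairman's cost C_p is incurred both when he works
    alone (P_FB) and when both repairmen work (P_BB); the additional
    repairman's cost C_b is incurred only in the latter case.
    """
    return C_f * E_n + C_a * P_A + C_p * P_FB + (C_b + C_p) * P_BB
```

For example, with the cost factors \( C_f = 50 \), \( C_a = 10 \), \( C_p = 80 \), \( C_b = 100 \) and the illustrative values E(n) = 1.35, P(A) = 0.3, P(FB) = 0.3, P(BB) = 0.4, the function evaluates 67.5 + 3 + 24 + 72 = 166.5.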

Numerical analysis

An analytical solution alone is not enough to establish the utility of the performance model; numerical experiments are also needed. The numerical results for the performance measures help system engineers and decision makers improve existing systems and design future ones. In this section, the sensitivity analysis is carried out for Case I. The default parameters for the numerical results depicted in Figs. 2, 3, 4 and Tables 1, 2, 3 are Y = 4, S = 3, λ = 0.3, α = 0.2, \( \lambda_d = 0.4 \), µ = 0.5, \( \mu_a = 0.2 \). For Tables 1, 2, 3 and Figs. 3, 4, the numerical results are obtained for M = 5, 7, 9. The variation in the cost function E{TC} shown in Fig. 2 is obtained by setting M = 7. The effects of the failure rates of operating machines (λ, \( \lambda_d \)), the failure rate of spares (α), the service rates of the repairmen (µ, \( \mu_a \)) and the number of operating machines (M) are examined on various performance measures: the expectation E(n) and variance Var(n) of the number of failed machines in the system, the throughput (τ) and the expected total cost E{TC} incurred by the system. The long-run probabilities of the different states of the system, namely the probability that the system is in the accumulation state P(A), the probability that only the permanent repairman is busy P(FB) and the probability that both repairmen are busy P(BB), are also explored numerically for variations in the different parameters. The sensitivity to the parameters is discussed as follows:

Fig. 2 Total cost of the system by varying µ for different values of \( \mu_a \)

Fig. 3 Expected number of failed machines by varying (a) λ, (b) α, (c) \( \lambda_d \) for different values of M

Fig. 4 Throughput of the system by varying (a) λ, (b) α, (c) \( \lambda_d \) for different values of M

Table 1 Performance measures of the system by varying M and λ
Table 2 Performance measures of the system by varying M and α
Table 3 Performance measures of the system by varying M and µ
  • Effect of the failure rate of machines: The failure rate of the operating machines (λ) affects the performance of a machining system significantly. Table 1 shows that as the failure rate (λ) of the operating machines increases, the probability that both the permanent and the additional repairmen are busy, P(BB), increases, whereas the probability that the system is in the accumulation state P(A), the probability that only the permanent repairman is busy P(FB) and Var(n) decrease. Figs. 3a and 4a show that the queue length E(n) and the throughput (τ), respectively, increase with the failure rate of the operating machines. When all the spares have been exhausted, the machines start failing at a degraded rate (\( \lambda_d \)) due to overload. The queue length E(n) and the throughput (τ) of the system increase with the degraded failure rate (\( \lambda_d \)) of the machines. A converging pattern is observed in the graphs shown in Figs. 3c and 4c.

    The spares can also fail, with rate (α), and this affects the system performance considerably. Table 2 shows that increasing the failure rate of the spares changes the probabilities and the variance in the same way as increasing the failure rate (λ) of the operating machines does in Table 1. The queue length E(n) and the throughput (τ) of the system increase gradually with the failure rate of the spares, as can be seen in Figs. 3b and 4b, respectively.

  • Effect of the repair rates of the permanent and additional repairmen: The repair rates of the permanent and additional repairmen affect the total cost of the system remarkably. Table 3 shows that the long-run probabilities P(A) and P(BB) increase, whereas the long-run probability P(FB) and Var(n) decrease, as the repair rate of the permanent repairman (µ) increases.

  • Variation in the cost function The variation in the expected total cost of the system E{TC} has been observed for three different sets of cost parameters which are depicted in Fig. 2a–c. The different cost parameters set for the figures are as follows:

    I. \( C_f \) = Rs. 50, \( C_a \) = Rs. 10, \( C_p \) = Rs. 80, \( C_b \) = Rs. 100.

    II. \( C_f \) = Rs. 100, \( C_a \) = Rs. 50, \( C_p \) = Rs. 200, \( C_b \) = Rs. 300.

    III. \( C_f \) = Rs. 100, \( C_a \) = Rs. 50, \( C_p \) = Rs. 250, \( C_b \) = Rs. 250.

From Fig. 2a–c, it is observed that the expected total cost of the system E{TC} increases with the repair rate of the permanent repairman (µ). However, as the repair rate of the additional repairman (\( \mu_a \)) increases, the total cost E{TC} first decreases and then increases, i.e., E{TC} is convex with respect to the repair rate \( \mu_a \). From Fig. 2a, a minimum cost of E{TC} = Rs. 407.88 is obtained at the optimal repair rates \(\mu = 1 \) for the permanent repairman and \( \mu_a = 1.37 \) for the additional repairman.
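The text does not state how the optimal repair rates were located. One simple possibility is an exhaustive search over a grid of candidate rates, sketched below under that assumption. The function `grid_search` and the quadratic `toy_cost` are hypothetical: `toy_cost` is only a stand-in with a known minimum and is not the cost function E{TC} of the model, which would require the steady-state probabilities for each pair of rates.

```python
def grid_search(cost, mu_values, mu_a_values):
    """Return (mu, mu_a, cost) for the grid point with the smallest cost.

    `cost` is any callable taking (mu, mu_a) and returning a number.
    """
    best = None
    for mu in mu_values:
        for mu_a in mu_a_values:
            c = cost(mu, mu_a)
            if best is None or c < best[2]:
                best = (mu, mu_a, c)
    return best


def toy_cost(mu, mu_a):
    """Stand-in convex cost with its minimum at mu = 1.0, mu_a = 1.25."""
    return (mu - 1.0) ** 2 + (mu_a - 1.25) ** 2 + 10.0


# Candidate rates 0.5, 0.75, ..., 1.5 (multiples of 0.25 are exact floats).
rates = [0.5 + 0.25 * i for i in range(5)]
best_mu, best_mu_a, best_cost = grid_search(toy_cost, rates, rates)
```

A grid search needs no derivatives and is easy to verify, at the price of one cost evaluation per grid point; a finer grid or a local refinement around the best point would be needed to resolve an optimum to two decimal places.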

The results can be summarized as follows:

  • The failure rates of both the operating machines and the standbys should be kept low to avoid an excessive workload on the repairmen.

  • The degraded failure rate of the system should also be kept low; otherwise, a long queue builds up at the repairmen.

  • The repair rate of the additional repairman should be kept higher than that of the permanent repairman to minimize the overall cost of the system.

Discussion

In this investigation, a threshold-based repair facility for the time-shared Markovian machine repair problem with mixed standbys under the care of one permanent repairman and one additional repairman has been studied. The features of mixed standbys, degraded failure and an additional repairman, incorporated together in one model, make the study more realistic and applicable to real-world industrial organizations operating in a multi-component machining environment. The repair rates for the failed operating machines and spare machines should be kept high for smooth functioning of the system. The incorporation of the threshold N-policy to turn on the permanent repairman makes the system cost effective. In many machining systems the permanent repairman cannot cope with an increase in workload, so the provision of an additional repairman may help the failed machines recover faster. The numerical results for the various performance indices can give system designers and industrial engineers insight into how to improve the efficiency and reliability of such machining systems. The cost analysis, which evaluates the minimum cost for a given set of cost parameters, can help decision makers minimize the cost of maintainability and thereby increase profit. This work can be further extended by incorporating more features, such as bulk failure and switching failure.