1 Introduction

An important class of systems studied in reliability theory is that of so-called k-out-of-n systems. Such a system consists of n elements and works as long as at least k of the elements function. As technical structures with built-in redundancy, k-out-of-n systems find various applications in engineering where highly reliable products are needed; for example, they are used in the design of internet servers and of automotive and aeronautic engines. Consequently, they have attracted substantial interest and a vast literature on k-out-of-n systems is available. One stream of this literature concerns inference about component lifetimes based on failure times in a k-out-of-n system or in a sample of k-out-of-n systems. The classic works of Halperin (1952) and Bhattacharyya (1985) describe asymptotic properties of MLE’s based on failure times of components of a k-out-of-n system. Generalizations of these results to the case when some of the failure times are censored can be found, among others, in Kong and Fei (1996) and Lin and Balakrishnan (2011). There are also works developing methods for estimating the distribution of the components of a system from a sample of system lifetimes; see, for example, Ng et al. (2012), Navarro et al. (2012), Hermanns and Cramer (2018) and the references therein.

All the above-mentioned results, however, concern only the case when the component lifetimes have absolutely continuous distributions. Yet, in some applications the continuity assumption is not adequate. This is the case, for instance, when the system performs a task repetitively and its components have certain probabilities of breakdown upon each cycle, or when the component lifetimes represent the numbers of turn-on/switch-off cycles until failure. While reliability properties of k-out-of-n systems consisting of components with discrete lifetimes have been studied over the years, see Weiss (1962), Young (1970), Tank and Eryilmaz (2015), Dembińska (2018), Dembińska and Goroncy (2020) and Dembińska et al. (2019), to the best of our knowledge no results are available on inference about discrete lifetimes of components based on failure times in k-out-of-n systems.

The aim of this paper is to fill this gap in the literature. We focus on maximum likelihood estimation of an unknown parameter of the discrete distribution of the component lifetimes of a k-out-of-n system. The estimation is based on the failure times of components observed up to and including the system breakdown. In Sect. 2, we set our notation, describe the inference problem under consideration and point out that this problem can be viewed as equivalent to inference from a Type-II right censored sample. Next, in Sect. 3, we present a theorem asserting that under some mild regularity conditions the MLE’s of interest exist almost surely for all sufficiently large n and are strongly consistent. The proof of this theorem is postponed to the “Appendix”. In Sect. 4, we choose three typical discrete failure distributions—Poisson, binomial and negative binomial—as the distributions of the component lifetimes and show that in these cases the MLE’s are unique whenever they exist, that their values can be obtained easily by numerical methods, and that the resulting MLE’s are strongly consistent. In Sect. 5, we perform a Monte Carlo simulation study to investigate finite-sample properties of the MLE’s discussed in Sect. 4. Section 6 contains an illustrative example based on real failure data, while in Sect. 7 we give concluding remarks and problems for future investigation.

2 Maximum likelihood point estimation

Let \({\mathcal {F}}=\{F(\theta ,\cdot ), \,\theta \in \varTheta \}\) be a family of discrete cumulative distribution functions (cdf’s), where \(\theta \in \mathbf {R}\) is the parameter of interest. Consider a k-out-of-n system which consists of n two-state (i.e., working or failed) components. We assume that the lifetimes of the components, \(T_{1},T_{2},\ldots ,T_{n}\), are independent and identically distributed (iid) random variables (rv’s) with the common cdf \(F(\theta ,\cdot )\in \mathcal {F}\), so that \(F(\theta ,t)=P_{\theta }(T_{1}\le t)\). Next, we denote \(f(\theta ,t)=P_{\theta }(T_{1}=t)\), i.e., \(f(\theta ,\cdot )\) is the probability mass function (pmf) corresponding to \(F(\theta ,\cdot )\), and \(\overline{F}(\theta ,t)=1-F(\theta ,t)\). Moreover, for simplicity of notation we require that for any \(\theta \in \varTheta \) the support of \(F(\theta ,\cdot )\), denoted by \({\mathrm{supp}}\,F(\theta ,\cdot )\), is of the form \(\{0,1,\ldots ,M\}\), where \(M\le \infty \). Yet, it is easily seen that the results of Sect. 3 hold more generally in the case when \({\mathrm{supp}}\,F(\theta ,\cdot )=\{x_0,x_1,\ldots ,x_M\}\), \(M\le \infty \), where \(x_0<x_1<\cdots <x_M\) and if \(M=\infty \) then the sequence \((x_n, n\ge 0)\) has no accumulation points.

Our aim is to use the maximum-likelihood approach to estimate the unknown parameter \(\theta \) from the failure data collected up to and including a breakdown of a k-out-of-n system. Let \(T_{1{:}\,n}\le T_{2{:}\,n}\le \cdots \le T_{n{:}\,n}\) denote the order statistics corresponding to \(T_{1},T_{2},\ldots ,T_{n}\). A k-out-of-n system works as long as at least k of its components work. It fails when the \((n-k+1)\)th component failure occurs. Thus, the lifetime of a k-out-of-n system is the \((n-k+1)\)th smallest of the component lifetimes, i.e., \(T_{n-k+1{:}\,n}\). However, in the case of discretely operating elements, if \(k\ne 1\) then at the moment of the system failure we do not necessarily have exactly \(n-k+1\) inoperative elements: since component failure times can tie with non-zero probability, the number of inoperative elements can be larger than \(n-k+1\); see Davies and Dembińska (2019) for details. Therefore, when collecting data up to and including a breakdown of a k-out-of-n system, we can register not only the values of \(T_{1{:}\,n}, T_{2{:}\,n},\ldots , T_{n-k+1{:}\,n}\) but also the value of S—the number of failed components at the moment of failure of the system. This means that we observe

$$\begin{aligned} S,T_{1{:}\,n}, T_{2{:}\,n},\ldots , T_{n-k+1{:}\,n}, \end{aligned}$$

or equivalently,

$$\begin{aligned} S,T_{1{:}\,n}, T_{2{:}\,n},\ldots , T_{S{:}\,n}. \end{aligned}$$

To express in a closed form the joint pmf of \(S,T_{1{:}\,n}, T_{2{:}\,n},\ldots , T_{S{:}\,n}\),

$$\begin{aligned} P_{\theta }(S=s,T_{1{:}\,n}=t_{1},\ldots ,T_{s{:}\,n}=t_{s}), \end{aligned}$$

and consequently to find the likelihood function of interest we follow an approach proposed by Gan and Bain (1995) based on the concept of tie-runs. Let \(s\in \{n-k+1,\ldots ,n\}\) and \(t_{1}\le t_{2}\le \cdots \le t_{s}\) have m tie-runs with lengths \(z_{1},z_{2},\ldots ,z_{m}\) \((z_{1}+\cdots +z_{m}=s)\), i.e.,

$$\begin{aligned} t_{1}=\cdots =t_{z_{1}}<t_{z_{1}+1}=\cdots =t_{z_{1}+z_{2}}<\cdots <t_{z_{1} +\cdots +z_{m-1}+1}=\cdots =t_{z_{1}+\cdots +z_{m}}(=t_{s}). \end{aligned}$$

Then the observed likelihood function of \(S,T_{1{:}\,n}, T_{2{:}\,n},\ldots , T_{S{:}\,n}\), given by

$$\begin{aligned} L(\theta )&=L(\theta ;\,s,\,t_1,\ldots ,t_{s})\\&=P_{\theta }(S=s,T_{1{:}\,n}=t_{1},\ldots , T_{n-k+1{:}\,n}=t_{n-k+1},\ldots ,T_{s{:}\,n}=t_{s})\\&=P_{\theta }(S=s,T_{1{:}\,n}=t_{1},\ldots , T_{n-k+1{:}\,n}=t_{n-k+1},\ldots ,T_{s{:}\,n}=t_{s},T_{s+1{:}\,n}>t_{s}), \end{aligned}$$

where \(T_{n+1{:}\,n}=\infty \), has the form

$$\begin{aligned} L(\theta )= \frac{n!}{(n-s)!\prod _{i=1}^{m}z_{i}!}\left( \prod _{i=1}^{m}[f(\theta ,t_{z_{1}+\cdots +z_{i}})]^{z_{i}}\right) \left[ \overline{F}(\theta ,t_{s})\right] ^{n-s} \end{aligned}$$
(1)

if \(s\in \{n-k+1,\ldots ,n\}\) and \(t_{n-k+1}=\cdots =t_s\). Otherwise the right-hand side of (1) reduces to 0.
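For readers who wish to evaluate (1) in software, the following minimal sketch (in Python; the helper names `log_likelihood`, `pmf` and `sf` are ours and merely stand for the assumed pmf \(f(\theta ,\cdot )\) and survival function \(\overline{F}(\theta ,\cdot )\)) computes \(\log L(\theta )\) from the observed data \(s,\,t_1\le \cdots \le t_s\) by first extracting the tie-runs.

```python
from itertools import groupby
from math import lgamma, log

def log_likelihood(theta, n, t, pmf, sf):
    """Log of the likelihood (1) for the ordered observed failure times t = (t_1,...,t_s).

    `pmf(theta, x)` and `sf(theta, x)` are the assumed probability mass function
    f(theta, x) and survival function F-bar(theta, x) = P(T > x), respectively.
    """
    s = len(t)
    # tie-runs: distinct observed values and their multiplicities z_1,...,z_m
    runs = [(value, sum(1 for _ in group)) for value, group in groupby(t)]
    # log of the combinatorial factor n! / ((n-s)! * prod_i z_i!)
    loglik = lgamma(n + 1) - lgamma(n - s + 1) - sum(lgamma(z + 1) for _, z in runs)
    # sum_i z_i * log f(theta, t_{z_1+...+z_i})
    loglik += sum(z * log(pmf(theta, value)) for value, z in runs)
    # (n - s) * log F-bar(theta, t_s); this factor is absent when s = n
    if n > s:
        loglik += (n - s) * log(sf(theta, t[-1]))
    return loglik
```

For instance, in the Poisson case of Sect. 4.1 one could pass `pmf=lambda th, x: poisson.pmf(x, th)` and `sf=lambda th, x: poisson.sf(x, th)` with `poisson` imported from `scipy.stats`.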

If the derivatives \(\frac{\partial }{\partial \theta }f(\theta ,t)\), \(t\in \{0,1,\ldots ,M\}\), exist, then the observed likelihood equation \(\frac{\partial }{\partial \theta }\log L(\theta ;\,s,\,t_1,\ldots ,t_{s})=0\) can be written as

$$\begin{aligned} \sum _{i=1}^{m}z_i \frac{\partial \log f(\theta ,t_{z_{1}+\cdots +z_{i}})}{\partial \theta } +(n-s)\frac{\partial \log \overline{F}(\theta ,t_{s})}{\partial \theta } =0, \end{aligned}$$
(2)

where \(\frac{\partial }{\partial \theta } \log \overline{F}(\theta ,t_{s})\) is defined to be equal to 0 if \(t_s=M<\infty \).

In Sect. 3, we will prove that under some simple regularity conditions concerning the family \({\mathcal {F}}\), the likelihood equation

$$\begin{aligned} \frac{\partial }{\partial \theta }\log L(\theta ;\,S,\,T_{1{:}\,n},\ldots ,T_{S{:}\,n})=0 \end{aligned}$$
(3)

has, with \(P_{\theta }\)-probability 1, a solution \(\hat{\theta }_n\) for all sufficiently large n, such that the sequence \((\hat{\theta }_n, n\ge 1)\) of estimators of \(\theta \) is strongly consistent. Next, in Sect. 4, we will show that for three families of typical discrete lifetime distributions the MLE of the parameter of interest is unique whenever it exists. Hence, by the result of Sect. 3, in the case of these three families the sequence of MLE’s is strongly consistent.

It is worth pointing out that the results presented in this paper, even though formulated in terms of inference from failure times of components of a k-out-of-n system up to and including its breakdown, can equally well be applied to inference based on Type-II right censored discrete data. Indeed, in an experiment with Type-II right censoring, n items with iid lifetimes \(T_1,T_2,\ldots ,T_n\) are put on test. Due to budget or time limitations, or on account of ethical considerations in biomedical problems, the experiment is terminated at the moment of the rth failure, where \(r<n\) is fixed in advance. If the lifetimes \(T_i\), \(i=1,\ldots ,n\), are discrete rv’s, then with non-zero probability it may happen that at the moment of the rth failure more than r items are broken. Clearly, in order not to lose any information it is reasonable to base the inference not only on the values of \(T_{1{:}\,n}, T_{2{:}\,n},\ldots , T_{r{:}\,n}\) but also on the value of S—the number of failed items at the time of the rth failure. Therefore, the problem is equivalent to inference based on \(S,T_{1{:}\,n}, T_{2{:}\,n},\ldots , T_{r{:}\,n}\), and with \(r=n-k+1\) it is exactly the same problem as inference from failure times of components of a k-out-of-n system up to and including its breakdown. To the best of our knowledge, maximum likelihood inference for discrete distributions based on censored data has not been studied before in the literature.

3 Strong consistency

The standard theorems of asymptotic theory of MLE’s constructed from iid observations do not apply to our problem in which we make inference from dependent and non-identically distributed rv’s \(S,T_{1{:}\,n},\ldots ,T_{n-k+1{:}\,n}\). However, as will be shown later on, the basic machinery of proving that under some regularity conditions MLE’s from iid observations are strongly consistent can be modified to derive the following analogous result for MLE’s obtained from failure data of a k-out-of-n system.

Theorem 1

Assume that the family \({\mathcal {F}}=\{F(\lambda ,\cdot ),\,\lambda \in \varTheta \}\) satisfies the following three conditions:

  1. (A1)

    \(\varTheta \subset \mathbf {R}\) is an open interval (possibly infinite); 

  2. (A2)

    for all \(\lambda \in \varTheta \) and \(j\in \{0,1,\ldots ,M\}\), \(\frac{\partial ^{3}f(\lambda ,j)}{\partial \lambda ^{3}}\) exists and is a continuous function of \(\lambda \in \varTheta ;\)

  3. (A3)

    \(\frac{\partial \log f(\lambda ,0)}{\partial \lambda }\ne 0\) for all \(\lambda \in \varTheta \).

Let \(T_{1},\ldots ,T_{n}\) be iid rv’s with cdf \(F(\theta ,\cdot )\) for some \(\theta \in \varTheta \). If \(k=k(n)=[(1-q)n]\), \(n\ge 1\), where \(q\in (0,1)\) and [x] stands for the largest integer not exceeding x, then there exists a sequence \((\hat{\theta }_{n},n\ge 1)\) such that, with \(P_{\theta }\)-probability 1,

  • for all sufficiently large n, \(\hat{\theta }_{n}\) is a solution to the likelihood equation (3);

  • \(\hat{\theta }_{n}\rightarrow \theta \) as \(n\rightarrow \infty \).

Proof

See the “Appendix”. \(\square \)

Theorem 1 can be used in practice, because for a given family \({\mathcal {F}}=\{F(\theta ,\cdot ),\) \(\theta \in \varTheta \}\) we can check if its assumptions are satisfied without knowing the value of the true parameter \(\theta \). In particular, this theorem will allow us to deduce that the MLE’s obtained in the next section are strongly consistent.

4 MLE’s for some specific families of distributions

In this section, we will consider the Poisson \(\mathrm{Poiss}(\theta )\), \(\theta >0\), binomial \(b(w,\theta )\), \(\theta \in (0,1)\), and negative binomial \(\mathrm{nb}(w,\theta )\), \(\theta \in (0,1)\), distributions as possible component lifetime distributions of a k-out-of-n system. These three distributions, besides the geometric one, are listed by Barlow and Proschan (1996) as typical discrete failure distributions widely used in reliability engineering. We will prove that, for all these discrete distributions, if the MLE of the parameter \(\theta \) based on the observed values of \(S,T_{1{:}\,n},\ldots , T_{S{:}\,n}\) exists, then it is unique. Hence, Theorem 1 will guarantee that in these cases the MLE of \(\theta \) exists almost surely for sufficiently large n and is strongly consistent.

It is worth pointing out that, since the geometric distribution is a special case of the negative binomial distribution, the results proved here for negative binomial component lifetimes hold in particular for geometrically distributed lifetimes. Yet the geometric case is easier: a closed-form formula for the MLE of \(\theta \) can then be obtained, and hence not only asymptotic but also exact distributional properties of this estimator can be given. For this reason, the geometric case will be considered in detail in a separate paper.

To prove the results of this section, we will make use of the following two lemmas. The first one is taken from Pólya and Szegő (1998, p. 41).

Lemma 1

Let the radius of convergence of the power series \(\sum _{i=0}^{\infty }\alpha _{i}x^{i}\) be \(\rho \in (0,\infty ]\), let the number of its zeros in the interval \(0<x<\rho \) be Z and let the number of changes of sign in the sequence of its coefficients be C. Then \(Z\le C\).

The second lemma concerns linear combinations of Bernstein polynomials and was first proved by Schoenberg (1959). Recall that Bernstein polynomials of degree w are defined by \(B_{j,w}(x)=\left( {\begin{array}{c}w\\ j\end{array}}\right) x^{j}(1-x)^{w-j}\), \(x\in (0,1)\), \(j=0,\ldots ,w\).

Lemma 2

The number of zeros of a given nonzero linear combination of Bernstein polynomials \(B(x)=\sum _{i=0}^{n}\beta _{i}B_{i,n}(x)\), \(x\in (0,1)\), does not exceed the number of sign changes of the sequence \({\beta }=(\beta _{0},\ldots ,\beta _{n})\). The first and the last signs of the sum are identical to the signs of the first and the last nonzero element of \({\beta }\), respectively.

From now on we will assume that the observed values of \(S,T_{1{:}\,n},\) \(\ldots , T_{S{:}\,n}\) are equal to \(s, t_1,\ldots ,t_s\), respectively, where \(s\in \{n-k+1,\ldots ,n\}\), \(t_{n-k+1}=\cdots =t_s\) and the sequence \(t_{1}\le t_{2}\le \cdots \le t_{s}\) has m tie-runs with lengths \(z_{1},z_{2},\ldots ,z_{m}\) \((z_{1}+\cdots +z_{m}=s)\). Furthermore, for simplicity of notation, we will write

$$\begin{aligned} \delta =z_{1}t_{z_{1}}+z_{2}t_{z_{1}+z_{2}}+\cdots +z_{m}t_{s}. \end{aligned}$$
(4)

4.1 Poisson distribution

Let the component lifetimes \(T_{i}\), \(i=1,\ldots ,n,\) have the Poisson distribution \(\mathrm{Poiss}(\theta )\) with a pmf

$$\begin{aligned} f(\theta ,t)=e^{-\theta }\,\frac{\theta ^{t}}{t!}, \quad t\in \,\{0,1,2,\ldots \}, \end{aligned}$$
(5)

where \(\theta >0\) is the parameter to estimate. Then by (1) and (4) the observed likelihood function of \(S,T_{1{:}\,n},\ldots , T_{S{:}\,n}\) can be written as

$$\begin{aligned} L(\theta )=C_{1}e^{-\theta n}\theta ^{\delta }\left( \sum _{j=t_{s}+1}^{\infty } \frac{\theta ^{j}}{j!}\right) ^{n-s}, \theta >0, \end{aligned}$$

where \(C_1\) does not depend on \(\theta \). Simple calculations show that

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}\theta }\log L(\theta )=-n+\frac{\delta }{\theta }+(n-s) \sum _{j=t_{s}}^{\infty }\frac{\theta ^{j}}{j!} \left( \sum _{j=t_{s}+1}^{\infty }\frac{\theta ^{j}}{j!}\right) ^{-1}, \; \theta >0. \end{aligned}$$
(6)

From (6) it is easily seen that if \(s=n\), then the function \(L(\theta )\), \(\theta >0\), has a global maximum at \(\delta /n\) if \(\delta >0\), and does not attain a global maximum if \(\delta =0\). Hence, the MLE does not exist when \(T_1=T_2=\cdots =T_n=0.\) We see at once that the probability of such an event, \(e^{-n\theta }\), decreases to 0 as \(n\rightarrow \infty \). It is also worth noting that if \(s=n\) and \(\delta >0\), that is, if we observe the event \(\{T_{n-k+1{:}\,n}=T_{n-k+2{:}\,n}=\cdots =T_{n{:}\,n}>0\}\), then the MLE is just equal to the sample mean.

It remains to consider the case when \(s\in \{n-k+1,\ldots ,n-1\}\). For this purpose, note that (6) can be rewritten as

$$\begin{aligned}&\frac{\mathrm{d}}{\mathrm{d}\theta }\log L(\theta ) \nonumber \\&\quad =\left( \sum _{j=t_{s}+1}^{\infty }\frac{\theta ^{j+1}}{j!}\right) ^{-1} \left( -n\sum _{j=t_{s}+1}^{\infty }\frac{\theta ^{j+1}}{j!}+\delta \sum _{j=t_{s}+1}^{\infty }\frac{\theta ^{j}}{j!} +(n-s)\sum _{j=t_{s}}^{\infty }\frac{\theta ^{j+1}}{j!}\right) \nonumber \\&\quad = \left( \sum _{j=t_{s}+1}^{\infty }\frac{\theta ^{j+1}}{j!}\right) ^{-1} \left\{ \big (\delta +(n-s)(t_{s}+1)\big ) \frac{\theta ^{t_{s}+1}}{(t_{s}+1)!} +\sum _{j=t_{s}+2}^{\infty }(\delta -js) \frac{\theta ^{j}}{j!}\right\} \nonumber \\&\quad = \left( \sum _{j=t_{s}+1}^{\infty } \frac{\theta ^{j+1}}{j!}\right) ^{-1} h(\theta ), \hbox { say}. \end{aligned}$$
(7)

From (7) it is clear that \(\frac{\mathrm{d}}{\mathrm{d}\theta }\log L(\theta )\) has the same sign as \( h(\theta )\). But \( h(\theta )\) can be represented as

$$\begin{aligned} h(\theta )=\sum _{j=0}^{\infty } \alpha _{s,t_{1},\ldots ,t_{s}}(j)\theta ^{j},\; \theta >0, \end{aligned}$$
(8)

where

$$\begin{aligned} \alpha _{s,t_{1},\ldots ,t_{s}}(j)=\left\{ \begin{array}{ll} 0, &{}\quad j=0,\ldots ,t_{s},\\ \big ((n-s)(t_{s}+1)+\delta \big )/(j!), &{}\quad j=t_{s}+1,\\ (\delta -sj)/(j!), &{}\quad j=t_{s}+2,t_{s}+3,\ldots . \end{array} \right. \end{aligned}$$

We see at once that \(\big ((n-s)(t_{s}+1)+\delta \big )/\big ((t_{s}+1)!\big )>0\) since \(s<n\) and \(\delta \ge 0\). Moreover, \((\delta -sj)/(j!)<0\) for \(j\ge t_{s}+2\), because

$$\begin{aligned} \delta -sj=z_{1}(t_{z_{1}}-j)+z_{2}(t_{z_{1}+z_{2}}-j)+\cdots +z_{m}(t_{s}-j) \end{aligned}$$

and \(t_{z_{1}+\cdots +z_{i}}-j<0\) for \(i=1,\ldots ,m\) and \(j\ge t_{s}+2\), which is due to the fact that \(0\le t_{z_{1}}<t_{z_{1}+z_{2}}<\cdots <t_{z_{1}+\cdots +z_{m}}=t_s\). Consequently, the number of sign changes in the sequence \((\alpha _{s,t_{1},\ldots ,t_{s}}(j), j\ge 0)\) equals one. The radius of convergence of the power series in (8) is \(\rho =\infty \). Therefore, Lemma 1 guarantees that the number of zeros of \(h(\theta )\) in the interval \((0,\infty )\) is at most one. But from (7) we see that \(h(\theta )=(n-s)\theta ^{t_s+1}/(t_s!)+(\delta -s\theta ) \sum _{j=t_s+1}^{\infty }\theta ^j/(j!)\) and consequently

$$\begin{aligned} h\left( \delta /s\right) =(n-s) \frac{\left( \delta /s\right) ^{t_{s}+1}}{t_{s}!}>0 \end{aligned}$$
(9)

and

$$\begin{aligned} h\left( \frac{\delta +(n-s)(t_{s}+1)}{s}\right) =-(n-s)(t_{s}+1)\sum _{j=t_{s}+2}^{\infty } \frac{\big (\delta +(n-s)(t_{s}+1)\big )^{j}}{s^jj!}<0. \end{aligned}$$
(10)

Hence, \(h(\theta )\) has exactly one zero in \((0,\infty )\), which by (7) shows that the likelihood equation

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}\theta }\log L(\theta ) =0 \end{aligned}$$
(11)

has exactly one solution in \((0,\infty )\). Moreover, the function \(L(\theta )\) is first increasing and then decreasing, which implies that it attains its global maximum and that the observed MLE of \(\theta \), being the solution to (11), is unique. From (9) and (10) we know that the observed MLE of \(\theta \) belongs to the finite interval \(\left( \frac{\delta }{s},\frac{\delta +(n-s)(t_{s}+1)}{s}\right) \) and therefore can be found easily by numerical methods.
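As an illustration, the following minimal sketch (Python with SciPy; the function name `poisson_mle` is ours) solves (11) by bracketing the unique root of (6) in the finite interval just identified when \(s<n\), and uses the closed form \(\delta /n\) when \(s=n\).

```python
from scipy.optimize import brentq
from scipy.stats import poisson

def poisson_mle(n, t):
    """MLE of theta for Poisson component lifetimes; t = (t_1,...,t_s), sorted."""
    s, ts = len(t), t[-1]
    delta = sum(t)                       # the statistic delta of (4)
    if s == n:                           # complete sample: closed form, or non-existence
        return delta / n if delta > 0 else None
    def dloglik(theta):                  # derivative (6) of the log-likelihood
        tail = poisson.sf(ts, theta)     # sum over j > t_s of e^(-theta) theta^j / j!
        return -n + delta / theta + (n - s) * (poisson.pmf(ts, theta) + tail) / tail
    lo = delta / s if delta > 0 else 1e-8
    hi = (delta + (n - s) * (ts + 1)) / s
    return brentq(dloglik, lo, hi)       # unique root in (delta/s, (delta+(n-s)(t_s+1))/s)
```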

Thus, we have proved the following theorem.

Theorem 2

Suppose that the component lifetimes follow the Poisson distribution with pmf given in (5) and that we have observed the failure times of components of a k-out-of-n system up to and including the breakdown of the system: \(S=s,T_{1{:}\,n}=t_{1},\ldots ,T_{s{:}\,n}=t_{s}\).

  1. (1)

    Then \(\hat{\theta }_{\mathrm{ML},n}(s,t_1,\ldots ,t_s)\), the observed MLE of \(\theta \), is unique provided it exists. More precisely, we have

    • \(\hat{\theta }_{\mathrm{ML},n}(s,t_1,\ldots ,t_s)\) does not exist if \(s=n\) and \(\delta =0\) (i.e., if \(t_1=t_2=\cdots =t_n=0\)),

    • \(\hat{\theta }_{\mathrm{ML},n}(s,t_1,\ldots ,t_s)=\delta /n\) if \(s=n\) and \(\delta >0\),

    • \(\hat{\theta }_{\mathrm{ML},n}(s,t_1,\ldots ,t_s)\) is unique, belongs to the interval \(\left( \frac{\delta }{s},\frac{\delta +(n-s)(t_{s}+1)}{s}\right) \) and hence can be obtained easily by numerical methods if \(s\in \{n-k+1,\ldots ,n-1\}\).

  2. (2)

    Moreover, by Theorem 1, if \(k=[np]\), \(0<p<1\), then almost surely, for all sufficiently large n, \(\hat{\theta }_{\mathrm{ML},n}=\hat{\theta }_{\mathrm{ML},n}(S,T_{1{:}\,n},\ldots , T_{S{:}\,n})\) exists and \(\hat{\theta }_{\mathrm{ML},n}\) is a strongly consistent estimator of \(\theta \).

4.2 Binomial distribution

Now suppose that the component lifetimes \(T_{i}\), \(i=1,\ldots ,n,\) of a k-out-of-n system have the binomial distribution \(b(w,\theta )\) with the following pmf

$$\begin{aligned} f(\theta ,t)=\left( {\begin{array}{c}w\\ t\end{array}}\right) \theta ^{t}(1-\theta )^{w-t},\quad t\in \,\{0,1,\ldots ,w\}, \end{aligned}$$
(12)

where \(w\in \{1,2,\ldots \}\) is known and \(\theta \in (0,1)\) is the parameter to estimate. With the notation (4), the observed likelihood function (1) is given by

$$\begin{aligned} L(\theta )=C_{2}\theta ^{\delta }(1-\theta )^{ws-\delta }\left[ \sum _{j=t_{s}+1}^{w}\left( {\begin{array}{c}w\\ j\end{array}}\right) \theta ^{j}(1-\theta )^{w-j}\right] ^{n-s}, \; \theta \in (0,1), \end{aligned}$$
(13)

where \(C_{2}\) does not depend on \(\theta \). An easy computation shows that, for \( \theta \in (0,1)\),

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}\theta }\log L(\theta )=\frac{\delta }{\theta }-\frac{ws-\delta }{1-\theta } +(n-s) \frac{\sum _{j=t_{s}+1}^{w}(j-w\theta )\left( {\begin{array}{c}w\\ j\end{array}}\right) \theta ^{j-1}(1-\theta )^{w-j-1}}{\sum _{j=t_{s}+1}^{w}\left( {\begin{array}{c}w\\ j\end{array}}\right) \theta ^{j}(1-\theta )^{w-j}}. \end{aligned}$$
(14)

If \(s=n\) then we see from (14) that the function \(L(\theta )\), \(\theta \in (0,1)\), has a global maximum at \(\delta /(wn)\) if \(\delta >0\), and does not attain a global maximum if \(\delta =0\). Hence, similarly to the Poisson case, the MLE does not exist when \(T_1=T_2=\cdots =T_n=0\) and it is easily seen that the probability of non-existence, \((1-\theta )^{nw}\), approaches 0 as \(n\rightarrow \infty \). Moreover, if \(T_{n-k+1{:}\,n}=T_{n-k+2{:}\,n}=\cdots =T_{n{:}\,n}>0\), then the MLE is equal to the sample mean divided by w.

The case of \(s\in \{n-k+1,\ldots ,n-1\}\) requires more effort. Note that (14) can be rewritten as

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}\theta }\log L(\theta )=(1-\theta )^{-1}\left\{ \sum _{j=t_{s}+1}^{w} \left( {\begin{array}{c}w\\ j\end{array}}\right) \theta ^{j}(1-\theta )^{w-j}\right\} ^{-1}g(\theta ), \end{aligned}$$
(15)

where

$$\begin{aligned} g(\theta )&=\left\{ \frac{1-\theta }{\theta } \delta -(ws-\delta )\right\} \sum _{j=t_{s}+1}^{w} \left( {\begin{array}{c}w\\ j\end{array}}\right) \theta ^{j}(1-\theta )^{w-j}\\&\quad +\,(1-\theta )(n-s)\sum _{j=t_{s}+1}^{w} \left( {\begin{array}{c}w\\ j\end{array}}\right) \left( j-w\theta \right) \theta ^{j-1}(1-\theta )^{w-j-1}. \end{aligned}$$

Clearly \(g(\theta )\) has the same sign as \(\frac{\mathrm{d}}{\mathrm{d}\theta }\log L(\theta )\). Moreover, it is easy to check that \(g(\theta )\) can be represented as the following linear combination of Bernstein polynomials \(B_{j,w}(x)=\left( {\begin{array}{c}w\\ j\end{array}}\right) x^{j}(1-x)^{w-j},\) \(x\in (0,1)\), \(j=0,1,\ldots ,w\),

$$\begin{aligned} g(\theta )= & {} \sum _{j=t_{s}}^{w-1}\left( \delta \frac{w-j}{j+1}+(n-s)(w-j)\right) B_{j,w}(\theta ) \\&-\sum _{j=t_{s}+1}^{w}\big ((ws-\delta )+(n-s)(w-j)\big )B_{j,w}(\theta ) \\= & {} \sum _{j=0}^{w} \beta _{s,t_{1},\ldots ,t_{s}}(j)B_{j,w}(\theta ),\quad \theta \in (0,1), \end{aligned}$$

where

$$\begin{aligned} \beta _{s,t_{1},\ldots ,t_{s}}(j)=\left\{ \begin{array}{ll} 0, &{}\quad j=0,\ldots ,t_{s}-1,\\ \frac{w-t_{s}}{t_{s}+1}\big (\delta +(n-s)(t_{s}+1)\big ), &{}\quad j=t_{s},\\ \frac{\delta (w+1)-ws(j+1)}{j+1}, &{}\quad j=t_{s}+1,\ldots ,w-1.\\ -(ws-\delta ), &{}\quad j=w. \end{array} \right. \end{aligned}$$

The coefficient \(\beta _{s,t_{1},\ldots ,t_{s}}(t_{s})\) is positive since \(t_s<w\) when \(s<n\). Now for \(j=t_{s}+1,\ldots ,w-1\) we check that \(\beta _{s,t_{1},\ldots ,t_{s}}(j)<0\), which is equivalent to the inequality

$$\begin{aligned}&\delta (w+1)-ws(j+1) \nonumber \\&\quad =z_{1}\{w(t_{z_{1}}-j)+(t_{z_{1}}-w)\}+\cdots +z_{m} \{w(t_{s}-j)+(t_{s}-w)\}<0. \end{aligned}$$
(16)

But

$$\begin{aligned} t_{z_{1}}<t_{z_{1}+z_{2}}<\cdots<t_{s}< w, \end{aligned}$$
(17)

which shows that for \(j=t_{s}+1,\ldots ,w-1\) the expressions in the braces in (16) are negative and hence (16) holds. Finally, we verify that \(\beta _{s,t_{1},\ldots ,t_{s}}(w)<0\). This corresponds to the inequality

$$\begin{aligned} z_{1}(t_{z_{1}}-w)+z_{2}(t_{z_{1}+z_{2}}-w)+\cdots +z_{m}(t_{s}-w)<0, \end{aligned}$$

which is true because of relation (17). Summarizing, we have proved that

$$\begin{aligned} \text{ sgn }(\beta _{s,t_{1},\ldots ,t_{s}}(j))=\left\{ \begin{array}{ll} 0, &{}\quad j=0,\ldots ,t_{s}-1,\\ +\,1, &{}\quad j=t_{s},\\ -\,1, &{}\quad j=t_{s}+1,\ldots ,w. \end{array} \right. \end{aligned}$$

Lemma 2 now ensures that the sign of the derivative (15) is first positive and then negative on (0, 1). From this we conclude that the observed likelihood function (13) is first increasing and then decreasing there. Hence, it has a global maximum in (0, 1), attained at the unique solution to the observed likelihood equation. Therefore, the observed MLE of \(\theta \) exists and is unique.
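A possible numerical implementation of this case is sketched below in Python (the function name `binomial_mle` is ours and the sketch assumes \(s<n\); for \(s=n\) the closed forms above apply). It builds the coefficients \(\beta _{s,t_{1},\ldots ,t_{s}}(j)\) explicitly and exploits the sign pattern just established: g is positive near 0 and negative near 1, so its unique root can be bracketed by essentially the whole interval (0, 1).

```python
from math import comb
from scipy.optimize import brentq

def binomial_mle(n, w, t):
    """MLE of theta for b(w, theta) component lifetimes when s = len(t) < n."""
    s, ts = len(t), t[-1]
    delta = sum(t)                                   # the statistic delta of (4)
    # Bernstein coefficients beta_j of g(theta); sign pattern 0,...,0,+,-,...,-
    beta = [0.0] * (w + 1)
    beta[ts] = (w - ts) / (ts + 1) * (delta + (n - s) * (ts + 1))
    for j in range(ts + 1, w):
        beta[j] = (delta * (w + 1) - w * s * (j + 1)) / (j + 1)
    beta[w] = -(w * s - delta)
    def g(theta):                                    # same sign as d/dtheta log L(theta)
        return sum(b * comb(w, j) * theta**j * (1 - theta)**(w - j)
                   for j, b in enumerate(beta))
    return brentq(g, 1e-9, 1 - 1e-9)                 # unique root of g in (0, 1)
```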

Thus, we have the following analogue of Theorem 2.

Theorem 3

Suppose that the component lifetimes follow the binomial distribution with pmf given in (12) and that we have observed the failure times of components of a k-out-of-n system up to and including the breakdown of the system: \(S=s,T_{1{:}\,n}=t_{1},\ldots ,T_{s{:}\,n}=t_{s}\).

  1. (1)

    Then \(\hat{\theta }_{\mathrm{ML},n}(s,t_1,\ldots ,t_s)\), the observed MLE of \(\theta \), is unique provided it exists. More precisely, we have

    • \(\hat{\theta }_{\mathrm{ML},n}(s,t_1,\ldots ,t_s)\) does not exist if \(s=n\) and \(\delta =0\),

    • \(\hat{\theta }_{\mathrm{ML},n}(s,t_1,\ldots ,t_s)=\frac{\delta }{nw} \) if \(s=n\) and \(\delta >0\),

    • \(\hat{\theta }_{\mathrm{ML},n}(s,t_1,\ldots ,t_s)\) is unique and can be obtained easily by numerical methods if \(s\in \{n-k+1,\ldots ,n-1\}\).

  2. (2)

    Moreover, conclusion (2) of Theorem 2 holds.

4.3 Negative binomial distribution

Consider a k-out-of-n system composed of n components whose lifetimes \(T_{i}\), \(i=1,\ldots ,n\), have the negative binomial distribution \(\mathrm{nb}(w,\theta )\) with a pmf

$$\begin{aligned} f(\theta ,t)=\left( {\begin{array}{c}t+w-1\\ w-1\end{array}}\right) \theta ^{t}(1-\theta )^{w},\quad t\in \{0,1,2,\ldots \}, \end{aligned}$$
(18)

where \(w\in \{1,2,\ldots \}\) is known and \(\theta \in (0,1)\) is the parameter to estimate. Then the observed likelihood function (1) takes on the form

$$\begin{aligned} L(\theta )= & {} C_{3}\theta ^{\delta }(1-\theta )^{sw}\left\{ 1-\sum _{j=0}^{t_{s}} \left( {\begin{array}{c}j+w-1\\ w-1\end{array}}\right) \theta ^{j}(1-\theta )^{w}\right\} ^{n-s}\nonumber \\= & {} C_{3}\theta ^{\delta }(1-\theta )^{nw} \left\{ \sum _{j=t_{s}+1}^{\infty }\left( {\begin{array}{c}j+w-1\\ w-1\end{array}}\right) \theta ^{j}\right\} ^{n-s}, \theta \in (0,1), \end{aligned}$$
(19)

where \(C_{3}\) does not depend on \(\theta \) and \(\delta \) is given in (4). The observed likelihood equation (2) becomes

$$\begin{aligned} \frac{\delta }{\theta }-\frac{nw}{1-\theta }+(n-s) \sum _{j=t_{s}+1}^{\infty }\left( {\begin{array}{c}j+w-1\\ w-1\end{array}}\right) j\theta ^{j-1} \left\{ \sum _{j=t_{s}+1}^{\infty }\left( {\begin{array}{c}j+w-1\\ w-1\end{array}}\right) \theta ^{j}\right\} ^{-1}=0, \end{aligned}$$
(20)

or equivalently

$$\begin{aligned} \{\delta (1-\theta )-nw\theta \} \sum _{j=t_{s}+1}^{\infty } \left( {\begin{array}{c}j+w-1\\ w-1\end{array}}\right) \theta ^{j} +(1-\theta )(n-s)\sum _{j=t_{s}+1}^{\infty } \left( {\begin{array}{c}j+w-1\\ w-1\end{array}}\right) j\theta ^{j}=0. \end{aligned}$$
(21)

If \(s=n\) and \(\delta =0\), then the function \(L(\theta )\) given in (19) is decreasing and consequently the observed MLE of \(\theta \) does not exist. It is obvious that the probability of non-existence, \(P(T_1=T_2=\cdots =T_n=0)=(1-\theta )^{nw}\), decreases to 0 as \(n\rightarrow \infty \). Otherwise, that is when \(s<n\) or \(\delta >0\), we have

$$\begin{aligned} \lim _{\theta \searrow 0}L(\theta )=0=\lim _{\theta \nearrow 1}L(\theta ) \end{aligned}$$

and, since \(L(\theta )\), \(\theta \in (0,1)\), is continuous and positive, it has a global maximum. In the case when \(s=n\) and \(\delta >0\) we easily see from (20) that this global maximum is attained at \(\theta =\delta /\left( nw+\delta \right) \) and hence that the observed MLE of \(\theta \) is equal to

$$\begin{aligned} \hat{\theta }_{\mathrm{ML},n}(s,t_1,\ldots ,t_s) =\frac{\delta }{nw+\delta }=\frac{\bar{t}}{w+\bar{t}} \hbox { if } s=n \hbox { and } \delta >0, \end{aligned}$$

where \(\bar{t}\) is the observed sample mean.

It remains to consider the case when \(s\in \{n-k+1,\ldots ,n-1\}\). For this purpose, note that the left-hand side of (21) can be represented as the following power series

$$\begin{aligned}&\sum _{j=t_{s}+1}^{\infty }\big (\delta +j(n-s)\big ) \left( {\begin{array}{c}j+w-1\\ w-1\end{array}}\right) \theta ^{j}\nonumber \\&\quad -\sum _{j=t_{s}+2}^{\infty } \left( \delta +nw+(n-s)(j-1)\right) \left( {\begin{array}{c}j+w-2\\ w-1\end{array}}\right) \theta ^{j}\nonumber \\&\quad =\sum _{j=0}^{\infty }\gamma _{s,t_{1},\ldots ,t_{s}}(j)\theta ^{j}, \end{aligned}$$
(22)

where

$$\begin{aligned} \gamma _{s,t_{1},\ldots ,t_{s}}(j)=\left\{ \begin{array}{ll} 0, &{}\quad j=0,\ldots ,t_{s},\\ \left( {\begin{array}{c}w+t_{s}\\ w-1\end{array}}\right) \big (\delta +(n-s)(t_{s}+1)\big ), &{}\quad j=t_{s}+1,\\ \left( {\begin{array}{c}j+w-2\\ w-1\end{array}}\right) \frac{\delta (w-1)-swj}{j},&\quad j=t_{s}+2,t_{s}+3,\ldots . \end{array} \right. \end{aligned}$$

Hence

$$\begin{aligned} {\mathrm{sgn}} (\gamma _{s,t_{1},\ldots ,t_{s}}(j))=\left\{ \begin{array}{ll} 0, &{}\quad j=0,\ldots ,t_{s},\\ +\,1, &{}\quad j=t_{s}+1,\\ -\,1, &{}\quad j=t_{s}+2,t_{s}+3,\ldots . \end{array} \right. \end{aligned}$$
(23)

Indeed, if \(s<n\) then \(\delta +(n-s)(t_{s}+1)>0\), which implies \(\gamma _{s,t_{1},\ldots ,t_{s}}(t_{s}+1)>0\). Moreover, if \(j\ge t_s+2\) then \(j>t_{s}>\cdots>t_{z_{1}+z_{2}}>t_{z_{1}}\) and consequently

$$\begin{aligned}&\delta (w-1)-swj = z_{1}\big ((w-1)t_{z_{1}}-wj\big ) \\&\quad +\,z_{2}\big ((w-1)t_{z_{1}+z_{2}}-wj\big )+\cdots +z_{m} \big ((w-1)t_{s}-wj\big )<0, \end{aligned}$$

which shows that \(\gamma _{s,t_{1},\ldots ,t_{s}}(j)<0\) for \(j\ge t_{s}+2\).

Since the radius of convergence of the power series in (22) is \(\rho =1\), from Lemma 1 and (23) we obtain that the left-hand side of (21) [or equivalently of (20)] considered as a function of \(\theta \) has at most one zero in the interval (0, 1). But from the previous discussion, we know that the function \(L(\theta )\), \(\theta \in (0,1)\), has a global maximum. Therefore, the left-hand side of (20) has exactly one zero in (0, 1) and this zero is a point at which the likelihood function \(L(\theta )\) attains its global maximum. The observed MLE of \(\theta \) is unique and can be obtained easily by numerical methods.
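For completeness, a minimal Python sketch for this case is given below (the function name `negative_binomial_mle` is ours). Rather than truncating the infinite series in (20), it maximizes the first expression for \(L(\theta )\) in (19) with a bounded scalar optimizer, using the fact that in SciPy's parametrization `nbinom.sf(t, w, 1 - theta)` equals the survival function \(\overline{F}(\theta ,t)\) used in this paper.

```python
from math import log
from scipy.optimize import minimize_scalar
from scipy.stats import nbinom

def negative_binomial_mle(n, w, t):
    """MLE of theta for nb(w, theta) component lifetimes; t = (t_1,...,t_s), sorted."""
    s, ts = len(t), t[-1]
    delta = sum(t)                            # the statistic delta of (4)
    if s == n:                                # complete sample: closed form, or non-existence
        return delta / (n * w + delta) if delta > 0 else None
    # log of (19) up to an additive constant; L(theta) is unimodal on (0, 1),
    # so a bounded scalar minimizer of the negative log-likelihood finds its maximum
    def neg_loglik(theta):
        return -(delta * log(theta) + s * w * log(1 - theta)
                 + (n - s) * log(nbinom.sf(ts, w, 1 - theta)))
    return minimize_scalar(neg_loglik, bounds=(1e-6, 1 - 1e-6), method="bounded").x
```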

Thus, we have proved the following result.

Theorem 4

Suppose that the component lifetimes follow the negative binomial distribution with pmf given in (18) and that we have observed the failure times of components of a k-out-of-n system up to and including the breakdown of the system: \(S=s,T_{1{:}\,n}=t_{1},\ldots ,T_{s{:}\,n}=t_{s}\).

  1. (1)

    Then conclusion (1) of Theorem 3 holds with “\(\frac{\delta }{nw}\)” replaced by “\(\frac{\delta }{nw+\delta }\)”.

  2. (2)

    Moreover, conclusion (2) of Theorem 2 is valid.

5 Monte Carlo simulation study

From Sect. 4 we know that in the case of the Poisson \(\mathrm{Poiss}(\theta )\), binomial \(b(w,\theta )\) and negative binomial \(\mathrm{nb}(w,\theta )\) distributions the maximum likelihood estimators of \(\theta \) based on failure times of components of a k-out-of-n system observed up to and including the breakdown of the system are strongly consistent as \(n\rightarrow \infty \) and \(k=[pn]\), where \(p\in (0,1)\) is fixed. The aim of this section is to investigate finite-sample properties of these estimators via a Monte Carlo simulation study. For this purpose we assume Poisson \(\mathrm{Poiss}(\theta =1)\), binomial \(b(w=4,\theta =0.5)\) and negative binomial \(\mathrm{nb}(w=5,\theta =0.15)\) component lifetimes. The parameters of these distributions were chosen so that the corresponding variances are equal (in the case of the Poisson and binomial distributions) or approximately equal (in the case of the negative binomial distribution) to one. The almost equal variances make it possible to compare the three considered cases. Next, for each of the chosen distributions and for some selected values of n and k, we generate \(N=1000\) replications of the failure times of components of a k-out-of-n system observed up to and including the system breakdown, obtaining data of the form \(s^{(i)}\), \(t_{1}^{(i)}\le \cdots \le t_{n-k+1}^{(i)}=\cdots =t_{s^{(i)}}^{(i)}\), \(i=1,\ldots ,N\). For each \(i=1,\ldots ,N\) we then compute \(\hat{\theta }_{\mathrm{ML}}^{(i)}\), the maximum likelihood estimate of \(\theta \), using numerical methods if necessary. More precisely, to solve the corresponding likelihood equation we use a method of finding the unique root of a continuous function in a finite interval, such as the bisection method. Finally, we compute the mean and standard deviation of \(\hat{\theta }_{\mathrm{ML}}^{(i)}\), \(i=1,\ldots ,N\). These values can be treated as the simulated expectation and standard deviation of \(\hat{\theta }_{\mathrm{ML}}\). The obtained results are presented in Tables 1, 2 and 3. It is interesting that during the simulations we did not encounter samples with non-existing MLE’s; this is because, for the cases considered in the tables, the probabilities of non-existence are very small, as Table 4 shows.
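The data-generation step of a single replication is straightforward; the sketch below (Python with NumPy; the helper name `observed_failure_data` and the particular values of n, k and \(\theta \) are ours, chosen only for illustration) produces one realization of \(S,T_{1{:}\,n},\ldots ,T_{S{:}\,n}\), which can then be passed to one of the estimators sketched in Sect. 4.

```python
import numpy as np

rng = np.random.default_rng(2023)

def observed_failure_data(n, k, sampler):
    """One replication of (s, t_1 <= ... <= t_s): the component failures observed
    up to and including the breakdown of a k-out-of-n system."""
    lifetimes = np.sort(sampler(n))          # T_{1:n} <= ... <= T_{n:n}
    system_failure = lifetimes[n - k]        # T_{n-k+1:n}, the system lifetime
    observed = lifetimes[lifetimes <= system_failure]
    return len(observed), observed.tolist()  # s may exceed n-k+1 because of ties

# example replication: Poiss(theta = 1) lifetimes, n = 30, k = [0.6 * n] = 18
n, k, theta = 30, 18, 1.0
s, t = observed_failure_data(n, k, lambda m: rng.poisson(theta, size=m))
# theta_hat = poisson_mle(n, t)              # estimator from the Poisson sketch above
```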

Table 1 Simulated means and standard deviations of \(\hat{\theta }_{\mathrm{ML}}\) for various values of n and k when component lifetimes have the \(\mathrm{Poiss}(\theta =1)\) distribution
Table 2 Simulated means and standard deviations of \(\hat{\theta }_{\mathrm{ML}}\) for various values of n and k when component lifetimes have the binomial \(b(w=4,\theta =0.5)\) distribution
Table 3 Simulated means and standard deviations of \(\hat{\theta }_{\mathrm{ML}}\) for various values of n and k when component lifetimes have the negative binomial \(\mathrm{nb}(w=5,\theta =0.15)\) distribution
Table 4 Selected approximate values of probabilities of non-existence of MLE (formulas for these probabilities are taken from Sect. 4)

In the simulation study we observe that even for small n (\(n=15\)) the bias of \(\hat{\theta }_{\mathrm{ML}}\) is small—the simulated expectations of \(\hat{\theta }_{\mathrm{ML}}\) are close to the true values of \(\theta \). As n and k increase in such a way that k/n is kept fixed, both the bias and the standard deviation of \(\hat{\theta }_{\mathrm{ML}}\) decrease. Moreover, from Tables 1 and 3 we see that, for the same values of n, the bias and standard deviation are smaller when \(n-k+1\) is larger. This is so because a single experiment terminates at the moment of the \((n-k+1)\)th component failure, and a larger \(n-k+1\) allows us to collect more information and thus to estimate \(\theta \) with better precision.

In Table 2, we see a surprising situation. For the same values of \(n-k+1\) we obtain a larger bias when n is larger. This may not agree with our intuition—larger n means that more elements are involved in a single experiment and thus we might expect better estimation. Yet this is not the case. The reason is that if \(k=k(n)=[(1-q)n]\), where \(q\in (0,1)\) is such that the qth quantile of \(F(\theta ,\cdot )\) is not unique, then due to (38) the behavior of \(T_{n-k+1{:}\,n}\) is unstable, causing worse behavior of \(\hat{\theta }_{\mathrm{ML}}\). Note that the qth quantile of the binomial \(b(w=4,\theta =0.5)\) distribution is not unique when \(1-q=11/16\) but is unique when \(1-q=2/3\). Therefore, the biases presented in the left-hand part of Table 2 are greater than the corresponding ones given in the right-hand part of this table.
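The quantile statement is easy to check numerically. The short sketch below (Python with SciPy) evaluates the cdf of \(b(w=4,\theta =0.5)\) at its support points and tests whether q coincides with one of these cdf values, which, for a discrete distribution and \(0<q<1\), is exactly the condition for the qth quantile to be non-unique.

```python
from scipy.stats import binom

# cdf of b(w = 4, theta = 0.5) at the support points 0,...,4
cdf = [binom.cdf(j, 4, 0.5) for j in range(5)]   # [0.0625, 0.3125, 0.6875, 0.9375, 1.0]
for one_minus_q in (11 / 16, 2 / 3):
    q = 1 - one_minus_q
    non_unique = any(abs(q - c) < 1e-12 for c in cdf)
    print(f"1 - q = {one_minus_q:.4f}: qth quantile non-unique? {non_unique}")
```

Running this confirms that the qth quantile is non-unique for \(1-q=11/16\) (since \(q=5/16\) is a cdf value of this distribution) and unique for \(1-q=2/3\).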

6 Illustrative example

The following are times until breakdown, in days, of air monitors operated at a nuclear power plant: \( T_{1}(\omega )=8,\,T_{2}(\omega )=26,\,T_{3}(\omega )=10,\,T_{4}(\omega )=8,\, T_{5}(\omega )=29,\,T_{6}(\omega )=20,\,T_{7}(\omega )=10, \) for fixed \(\omega \in \varOmega \); see Bickel and Doksum (1977, p. 189). Assuming that the sample comes from a Poisson \(\mathrm{Poiss}(\theta )\) population and considering three scenarios, we will find the MLE’s of \(\theta \).

  1. 1.

    For an uncensored sample, it is well known that the MLE of \(\theta \) is equal to the sample mean. Therefore, based on the whole sample we obtain \(\hat{\theta }^{(1)}_{\mathrm{ML}}=15.86\).

  2. 2.

    Now suppose that we terminate the experiment at the moment of the \(r=5\)th failure, that is after 20 days. Then exactly \(s=5\) air monitors are broken. Based on the data \(S(\omega )=5,T_{1{:}\,7}(\omega )=8,T_{2{:}\,7}(\omega )=8,T_{3{:}\,7}(\omega )=10,T_{4{:}\,7}(\omega )=10,T_{5{:}\,7}(\omega )=20\), we get \(\hat{\theta }^{(2)}_{\mathrm{ML}}=12.80\).

  3. 3.

    Finally, let us consider censoring by terminating the experiment at the moment of the \(r=3\)rd failure, that is after 10 days. Then we observe breakdowns of \(s=4\) air monitors. Thus, we collect the following data: \(S(\omega )=4,T_{1{:}\,7}(\omega )=8,T_{2{:}\,7}(\omega )=8,T_{3{:}\,7}(\omega )=10,T_{4{:}\,7}(\omega )=10\). The MLE based on these data is equal to \(\hat{\theta }_{\mathrm{ML}}^{(3)}=10.87\).

We see that the value of the MLE changes significantly when we change the censoring scenario. This undesirable feature is due to the fact that \(n=7\) is very small. Evidently, to obtain more reliable estimates we need to conduct an experiment with a larger number of air monitors.

7 Conclusions

In this paper, we have focused on maximum likelihood inference about the discrete lifetime distribution of components of a k-out-of-n system in the case when the failure times of the components observed up to and including the moment of the breakdown of the system are available. Another problem of interest is inference in the case when a sample of lifetimes of k-out-of-n systems, together with the numbers of broken components at the moments of the system failures, is given. We are currently working on the latter problem and plan to report our findings in a forthcoming paper.

It is also worth pointing out that the new results we have obtained for the discrete case are analogous to those known in the literature for the continuous case, in the sense that in both cases, under some regularity conditions, the MLE’s of interest exist almost surely for sufficiently large n and are strongly consistent. Yet the regularity conditions for the two cases are different, and different techniques are needed in the proofs.