1 Introduction

The prediction of future record values is currently a topic of considerable interest. Several papers have studied this problem and applied it to different areas (climate change, sports, etc.). In many practical situations, the usual point predictions obtained with linear regression do not provide good record predictions. The basic tools in this field can be seen in Dunsmore (1983), Raqab and Nagaraja (1995), Awad and Raqab (2000) and Nevzorov (2001), and recent advances in Paul and Thomas (2016), Guo et al. (2020) and Volovskiy and Kamps (2020a, 2020b). Specific results for records from exponential distributions can be found in Awad and Raqab (2000), Basak and Balakrishnan (2003), Raqab (2007), Volovskiy and Kamps (2020a, 2020b) and the references therein. Analogously, the prediction of record values from a Pareto model was studied in Raqab et al. (2007), Volovskiy (2018) and Volovskiy and Kamps (2020b). Moreover, the prediction tools for generalized order statistics can also be applied to record values (see Burkschat 2009).

The conditional median (or quantile) regression curves are a good alternative to the classical (mean) regression tool. The theoretical quantile regression curve to predict a response variable Y from an explanatory variable X can be obtained from the copula of (X, Y) (see, e.g., Nelsen 2006, p. 217). In practice, we can estimate the unknown parameters (in the copula or in the marginal distributions) or use the empirical linear or non-linear estimators proposed in Koenker and Bassett (1978) and Koenker (2005). The package quantreg for the statistical program R can help us for this purpose. Moreover, the quantile regression curves can also be used to obtain confidence bands for the predictions and bivariate box plots; see Koenker and Bassett (1978) and Navarro (2020).

The aim of this paper is to use conditional median (or quantile) regression curves to predict future record values. Instead of using the general tools, we develop a specific procedure for upper record values. This procedure is based on the concept of multivariate distorted distributions introduced recently in Navarro et al. (2020), an alternative to the classical copula approach for representing the dependence structure. A similar method can be used for lower record values (see Sect. 5). This novel approach simplifies the expressions for the quantile curves.

The rest of the paper is organized as follows. The main results are given in Sect. 2. These general results are applied to three models in Sect. 3: a fixed model, where we consider a uniform distribution although a similar method holds for other fixed distribution functions; a parametric model, the popular proportional hazard rate (PHR) model, with applications to exponential and Pareto models; and, finally, a non-parametric approach. A (classical) real data set and a simulated data set of lifetimes in a specific sampling procedure in reliability are studied in Sects. 4 and 5, respectively. The conclusions are given in Sect. 6.

If f is a real-valued function of n variables, then \(\partial _i f\) denotes the partial derivative of f with respect to its i-th variable. Analogously, \(\partial _{i,j} f=\partial _i \partial _j f\) and so on. Whenever we use these partial derivatives, we tacitly assume that they exist.

2 Main results

Let \((X_n)_n\) be an infinite sequence of independent and identically distributed (IID) random variables with an absolutely continuous cumulative distribution function F, survival (reliability) function \({\bar{F}}=1-F\) and probability density function (PDF) \(f=F'\).

An observation \(X_j\) is called an upper record value in this sequence if it is greater than all previously observed values \(X_1,\dots ,X_{j-1}\). More specifically, the upper record times are defined as \(T(1)=1\) and

$$\begin{aligned} T(n+1)=\min \{ j: X_j>X_{T(n)}\} \end{aligned}$$

for \(n=1,2,\dots \). Then the sequence \((R_n)_n\) of upper record values is defined by \(R_n=X_{T(n)}\) for \(n=1,2,\dots \). The lower record values \((R^*_n)_n\) are defined in a similar way. The purpose of the paper is to predict \(R_{s}\) from \(R_1,\dots ,R_n\) (or just from a single \(R_i\) with \(i\le n\)) for \(s>n\).
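To fix ideas, the record-value definitions above can be sketched in a few lines of Python (a sketch only; the function name is ours, not from the paper):

```python
import random

def upper_records(xs):
    """Extract the upper record values R_1, R_2, ... from a sequence xs."""
    records = []
    for x in xs:
        # X_j is a record when it exceeds all previously observed values;
        # records[-1] is the running maximum, and the first value is R_1.
        if not records or x > records[-1]:
            records.append(x)
    return records

random.seed(1)
recs = upper_records([random.random() for _ in range(10_000)])
# the record sequence is strictly increasing by construction
```

For a continuous F, ties occur with probability zero, so the strict inequality matches the definition above.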

It is well known (see, e.g., Nevzorov 2001, p. 65) that the survival function \({\bar{G}}_n(t)=\Pr (R_n>t)\) of the n-th upper record value is given by

$$\begin{aligned} {\bar{G}}_n(t)={\bar{F}}(t)\sum _{k=0}^{n-1}\frac{(-\log (\bar{F}(t)))^k}{k!}={\bar{q}}_n({\bar{F}}(t)), \end{aligned}$$
(1)

where

$$\begin{aligned} {\bar{q}}_n(u)=u\sum _{k=0}^{n-1}\frac{(-\log (u))^k}{k!} \end{aligned}$$

for \(u\in [0,1]\) and \(n=1,2,\dots \). The function \({\bar{q}}_n:[0,1]\rightarrow [0,1]\) is continuous, increasing and satisfies \({\bar{q}}_n(0)=0\) and \({\bar{q}}_n(1)=1\). It is called a dual distortion function; see Hürlimann (2004). A similar representation holds for the distribution function \(G_n\) of \(R_n\), which can be written as

$$\begin{aligned} G_n(t)=q_n(F(t)), \end{aligned}$$

where \(q_n(u)=1-{\bar{q}}_n(1-u)\) for \(u\in [0,1]\). The function \(q_n\) is called the distortion function and \(G_n\) is a distorted distribution of F. Distorted distributions were introduced in Yaari (1987) in the context of the theory of choice under risk. Some applications and ordering results can be seen in Wang (1996) and Navarro et al. (2013).
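As a numerical illustration (a sketch; the function names are ours), the distortion in (1) can be coded directly. For a standard exponential baseline \({\bar{F}}(t)=e^{-t}\), \({\bar{q}}_n(e^{-t})\) equals the survival function of a Gamma(n, 1) distribution, a well-known property of exponential records:

```python
from math import exp, factorial, log

def qbar(n, u):
    """Dual distortion of the n-th upper record:
    qbar_n(u) = u * sum_{k<n} (-log u)^k / k!, as in (1)."""
    if u <= 0.0:
        return 0.0
    return u * sum((-log(u)) ** k / factorial(k) for k in range(n))

def record_survival(n, t, sbar):
    """Survival function Gbar_n(t) = qbar_n(Fbar(t)) of the n-th upper record."""
    return qbar(n, sbar(t))
```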

Then, the PDF \(g_n\) of \(R_n\) can be obtained as

$$\begin{aligned} g_n(t)=f(t)\ {\bar{q}}^\prime _n({\bar{F}}(t))=\frac{1}{(n-1)!}f(t)(-\log ({\bar{F}}(t)))^{n-1} \end{aligned}$$
(2)

(see, e.g., Awad and Raqab 2000). Moreover, the joint PDF \({\mathbf {g}}\) of \((R_1,\dots ,R_n)\) is given by

$$\begin{aligned} {\mathbf {g}}(x_1,\dots ,x_n)=h(x_1)\dots h(x_n){\bar{F}}(x_n) \end{aligned}$$
(3)

for \(x_1<\dots <x_n\) (see Nevzorov 2001, p. 68), where \(h=f/{\bar{F}}\) is the hazard (or failure) rate function associated with F. Hence, from (3), we obtain the following result.

Proposition 1

The joint survival function \({\bar{\mathbf {G}}}\) of \((R_1,\dots ,R_n)\) can be written as

$$\begin{aligned} {{\bar{\mathbf {G}}}}(x_1,\dots ,x_n)={\hat{D}}({\bar{F}}(x_1),\dots , {\bar{F}}(x_n)) \end{aligned}$$
(4)

for a continuous increasing function \({\hat{D}}:[0,1]^n\rightarrow [0,1]\) which does not depend on F and satisfies \({\hat{D}}(0,\dots ,0)=0\) and \({\hat{D}}(1,\dots ,1)=1\).

Proof

As F is continuous, \({\bar{G}}_1, \dots , {\bar{G}}_n\) are continuous and, from Sklar’s theorem (see, e.g., Nelsen 2006, p. 46), the joint survival function \({\bar{\mathbf {G}}}\) of \((R_1,\dots ,R_n)\) can be written as

$$\begin{aligned} {{\bar{\mathbf {G}}}}(x_1,\dots ,x_n)={\hat{C}}({\bar{G}}_1(x_1), \dots , {\bar{G}}_n(x_n)) \end{aligned}$$

for all \(x_1,\dots ,x_n\) and a unique (survival) copula function \({\hat{C}}\). Hence, from (1), (4) holds for the increasing and continuous function

$$\begin{aligned} {\hat{D}}(u_1,\dots ,u_n)={\hat{C}}({\bar{q}}_1(u_1),\dots ,{\bar{q}}_n(u_n)) \end{aligned}$$

defined for \(u_1,\dots ,u_n\in [0,1]\) that satisfies \({\hat{D}}(0,\dots ,0)=0\) and \({\hat{D}}(1,\dots ,1)=1\).

\(\square \)

A similar representation can be obtained for the joint distribution function \( {\mathbf {G}}\), that is,

$$\begin{aligned} {\mathbf {G}}(x_1,\dots ,x_n)=D(F(x_1),\dots , F(x_n)) \end{aligned}$$

for a continuous increasing function \(D:[0,1]^n\rightarrow [0,1]\) which does not depend on F. Note that D can be computed from \({\hat{D}}\) (and vice versa). The explicit expression of \({\hat{D}}\) (or D) is somewhat involved; however, the expression for \({\hat{C}}\) is much more complicated (some examples are given below).

These representations are similar to the classic copula representations (see, e.g., Nelsen 2006) but here D and \({\hat{D}}\) are not necessarily copulas and F is not the marginal distribution of \(R_i\) for \(i=2,\dots ,n\). Note that, as in (1), the joint survival \({\bar{\mathbf {G}}}\) is written in (4) as a distortion of the univariate survival function \({\bar{F}}\) at different points.

This kind of representation was called a multivariate distorted distribution in Navarro et al. (2020). It can be proved that D and \({\hat{D}}\) can be extended to \({\mathbb {R}}^n\) so that they become multivariate distribution functions with supports included in \([0,1]^n\). Analogously, the joint PDF of \((R_1,\dots ,R_n)\) can be written as follows.

Proposition 2

The joint PDF \({\mathbf {g}}\) of \((R_1,\dots ,R_n)\) can be written as

$$\begin{aligned} {\mathbf {g}}(x_1,\dots ,x_n)=f(x_1)\dots f(x_n){\hat{d}}({\bar{F}}(x_1),\dots , {\bar{F}}(x_n)), \end{aligned}$$
(5)

where \({\hat{d}}=\partial _{1,\dots ,n}{\hat{D}}\) is the PDF of \({\hat{D}}\) given by

$$\begin{aligned} {\hat{d}}(u_1,\dots ,u_n)=\frac{1}{u_1\dots u_{n-1}} \end{aligned}$$
(6)

for \(1>u_1>\dots>u_n>0\) (zero elsewhere).

Proof

The expression of the PDF \({\mathbf {g}}\) in (5) can be obtained by differentiating in (4). Then the explicit expression for \({\hat{d}}\) can be obtained from (3) and (5). \(\square \)

Therefore, the explicit expression for \({\hat{D}}\) can be obtained from the expression for \({\hat{d}}\) given in (6). Moreover, the different marginal distributions of \((R_1,\dots ,R_n)\) also have multivariate distorted distributions (see Navarro et al. 2020). For example, if \(1\le i< j\le n\), then the joint survival function \({\bar{\mathbf {G}}}_{i,j}\) of \((R_i,R_j)\) can be written from (4) as

$$\begin{aligned} {\bar{\mathbf {G}}}_{i,j}(x_i,x_j)={\hat{D}}_{i,j}({\bar{F}}(x_i), {\bar{F}}(x_j)), \end{aligned}$$
(7)

where \({\hat{D}}_{i,j}(u,v)={\hat{D}}(1,\dots ,1,u,1,\dots ,1,v,1,\dots ,1)\) and u and v are placed at the i-th and j-th variables, respectively.

Hence, to predict \(R_j\) from \(R_i\) for \(i<j\), we can use a technique similar to that developed in copula theory to compute the median regression curve. The result can be stated as follows.

Proposition 3

The conditional survival function \({\bar{\mathbf {G}}}_{j|i}\) of \((R_j|R_i=x_i)\) for \(1\le i< j\le n\) is given by

$$\begin{aligned} {\bar{\mathbf {G}}}_{j|i}(x_j|x_i)=\frac{(i-1)!}{(-\log {\bar{F}}(x_i))^{i-1}}\ \partial _{1}{\hat{D}}_{i,j}({\bar{F}}(x_i), {\bar{F}}(x_j)) \end{aligned}$$
(8)

for \(x_j\ge x_i\) whenever \(f(x_i)>0\), \(0<{\bar{F}}(x_i)<1\) and \(\lim _{v\rightarrow 0^+}\partial _{1}{\hat{D}}_{i,j}(u,v)=0\) for all \(0<u<1\).

Proof

From (7), the joint PDF of \((R_i, R_j)\) is

$$\begin{aligned} {\mathbf {g}}_{i,j}(x_i,x_j)= \partial _{1,2}{{\bar{\mathbf {G}}}}_{i,j}(x_i,x_j)=f(x_i)f(x_j)\ \partial _{1,2}{\hat{D}}_{i,j}({\bar{F}}(x_i), {\bar{F}}(x_j)). \end{aligned}$$

Therefore, from (2), the conditional PDF of \((R_j|R_i=x_i)\) is

$$\begin{aligned} {\mathbf {g}}_{j|i}(x_j|x_i)=\frac{{\mathbf {g}}_{i,j}(x_i,x_j)}{{\mathbf {g}}_{i}(x_i)}=\frac{(i-1)!}{(-\log {\bar{F}}(x_i))^{i-1}}f(x_j)\ \partial _{1,2}{\hat{D}}_{i,j}({\bar{F}}(x_i), {\bar{F}}(x_j)), \end{aligned}$$

for \(x_j>x_i\) and \(x_i\) such that \(f(x_i)>0\). Hence, its survival function is

$$\begin{aligned} {\bar{\mathbf {G}}}_{j|i}(x_j|x_i)&=\frac{(i-1)!}{(-\log {\bar{F}}(x_i))^{i-1}}\int _{x_j}^\infty {f(z)\partial _{1,2}{\hat{D}}_{i,j}({\bar{F}}(x_i), {\bar{F}}(z))}dz\\&= \frac{(i-1)!}{(-\log {\bar{F}}(x_i))^{i-1}}\left[ -\partial _{1}{\hat{D}}_{i,j}({\bar{F}}(x_i), {\bar{F}}(z))\right] _{z=x_j}^\infty \\&= \frac{(i-1)!}{(-\log {\bar{F}}(x_i))^{i-1}}\ \partial _{1}{\hat{D}}_{i,j}({\bar{F}}(x_i), {\bar{F}}(x_j)) \end{aligned}$$

where, in the last equality, we use that \(\lim _{v\rightarrow 0^+}\partial _{1}{\hat{D}}_{i,j}(u,v)=0\) for all \(0<u<1\) and that \(0<{\bar{F}}(x_i)<1\). \(\square \)

As a consequence, the median regression curve to predict \(R_j\) from \(R_i\) is

$$\begin{aligned} m_{j|i}(x_i)={\bar{\mathbf {G}}}^{-1}_{j|i}(0.5|x_i), \end{aligned}$$
(9)

where \({\bar{\mathbf {G}}}^{-1}_{j|i}\) is the inverse function of \({\bar{\mathbf {G}}}_{j|i}\). This procedure can also be used to determine quantile confidence bands for the prediction. For example, the centered \(50\%\) quantile confidence band is given by

$$\begin{aligned} \left[ {\bar{\mathbf {G}}}^{-1}_{j|i}(0.75|x_i), {\bar{\mathbf {G}}}^{-1}_{j|i}(0.25|x_i)\right] \end{aligned}$$

and the centered \(90\%\) quantile confidence band by

$$\begin{aligned} \left[ {\bar{\mathbf {G}}}^{-1}_{j|i}(0.95|x_i), {\bar{\mathbf {G}}}^{-1}_{j|i}(0.05|x_i)\right] . \end{aligned}$$

To get these predictions from (8), we need to compute \({\hat{D}}\) (or \({\hat{D}}_{i,j}\)) and to estimate \({\bar{F}}\). In the following subsections, we consider several relevant cases.
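When \({\bar{\mathbf {G}}}_{j|i}\) cannot be inverted in closed form, the median prediction (9) and the quantile bands can be obtained numerically. A minimal Python sketch based on bisection (the function names are ours; lo and hi must bracket the quantiles of interest):

```python
from math import exp, log

def survival_inverse(sbar, p, lo, hi, tol=1e-9):
    """Invert a continuous, strictly decreasing survival function by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sbar(mid) > p:   # survival still above p: move right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def median_and_bands(sbar, lo, hi):
    """Median prediction plus centered 50% and 90% quantile confidence bands."""
    inv = lambda p: survival_inverse(sbar, p, lo, hi)
    return inv(0.5), (inv(0.75), inv(0.25)), (inv(0.95), inv(0.05))
```

For instance, for the exponential conditional survival \(e^{-(x_2-x_1)}\) with \(x_1=1\), the numerical median agrees with the closed form \(1+\log 2\).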

2.1 Case \(i=1\) and \(j=2\)

A straightforward calculation from (3) proves that the joint survival function \({\bar{\mathbf {G}}}_{1,2}\) of \((R_1,R_2)\) can be written as

$$\begin{aligned} {{\bar{\mathbf {G}}}}_{1,2}(x_1,x_2)={\bar{F}}(x_2)+{\bar{F}}(x_2)\log \frac{{\bar{F}}(x_1)}{{\bar{F}}(x_2)}={\hat{D}}_{1,2}({\bar{F}}(x_1), {\bar{F}}(x_2)) \end{aligned}$$
(10)

for \(x_1\le x_2\), where

$$\begin{aligned} {\hat{D}}_{1,2}(u,v)=v+v\log \frac{u}{v} \end{aligned}$$

for \(1> u\ge v >0\). Note that \({\hat{D}}_{1,2}\) is not a copula since \({\hat{D}}_{1,2}(1,v)=v-v\log v\ne v\). Actually, \({\hat{D}}_{1,2}(1,v)={\bar{q}}_2(v)\), that is, the dual distortion function of the second upper record given in (1). Also note that \({\bar{F}}\) equals the first marginal survival function but not the second one (so (10) is not a copula representation). To get the copula (or survival copula) representation of \((R_1,R_2)\), we need the explicit expression of the inverse function of \(G_2\).

The distortion function \({\hat{D}}_{1,2}\) can also be obtained from Proposition 2 as

$$\begin{aligned} {\hat{D}}_{1,2}(u,v)=\int _{0}^v\int _{y}^u \frac{1}{x} dxdy=\int _{0}^v (\log u-\log y) dy=v+v\log \frac{u}{v} \end{aligned}$$

for \(1> u\ge v >0\). Hence \(\partial _1 {\hat{D}}_{1,2}(u,v)= v /u\) for \(1> u\ge v >0\), and, from (8), we get

$$\begin{aligned} {{\bar{\mathbf {G}}}}_{2|1}(x_2|x_1)=\frac{\partial _{1}{\hat{D}}_{1,2}({\bar{F}}(x_1), {\bar{F}}(x_2))}{(- \log {\bar{F}}(x_1))^{0}}=\frac{{\bar{F}}(x_2)}{{\bar{F}}(x_1)} \end{aligned}$$

for \(x_2\ge x_1\) and \(x_1\) such that \(0<{\bar{F}}(x_1)<1\) and \(f(x_1)>0\). This is a very well-known result (see, e.g., Nevzorov 2001, p. 68). Then, the median regression curve to predict \(R_2\) from \(R_1\) is

$$\begin{aligned} m_{2|1}(x_1)={{\bar{\mathbf {G}}}}^{-1}_{2|1}(0.5|x_1)={\bar{F}}^{-1}(0.5{\bar{F}}(x_1)), \end{aligned}$$
(11)

where \({\bar{F}}^{-1}\) is the inverse function of \({\bar{F}}\). Analogously, the centered \(50\%\) and \(90\%\) quantile confidence bands for this prediction are given by

$$\begin{aligned} \left[ {\bar{\mathbf {G}}}^{-1}_{2|1}(0.75|x_1), {\bar{\mathbf {G}}}^{-1}_{2|1}(0.25|x_1)\right] =\left[ {\bar{F}}^{-1}(0.75{\bar{F}}(x_1)),{\bar{F}}^{-1}(0.25{\bar{F}}(x_1))\right] \end{aligned}$$

and

$$\begin{aligned} \left[ {\bar{\mathbf {G}}}^{-1}_{2|1}(0.95|x_1), {\bar{\mathbf {G}}}^{-1}_{2|1}(0.05|x_1)\right] =\left[ {\bar{F}}^{-1}(0.95{\bar{F}}(x_1)),{\bar{F}}^{-1}(0.05{\bar{F}}(x_1)) \right] . \end{aligned}$$

They are plotted in Sect. 3 for different survival functions \({\bar{F}}\).
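These predictions can be checked by simulation. For a standard uniform baseline, (11) gives \(m_{2|1}(x_1)={\bar{F}}^{-1}(0.5{\bar{F}}(x_1))=0.5+0.5x_1\); the Python sketch below (variable names are ours) generates pairs \((R_1,R_2)\) by brute force and verifies empirically that about half of the second records fall above the median regression curve:

```python
import random

def first_two_records():
    """Draw an IID uniform sequence until the second upper record appears."""
    r1 = random.random()          # R_1 = X_1
    x = random.random()
    while x <= r1:                # keep sampling until some X_j > R_1
        x = random.random()
    return r1, x                  # (R_1, R_2)

random.seed(2023)
trials = 10_000
above = sum(r2 > 0.5 + 0.5 * r1
            for r1, r2 in (first_two_records() for _ in range(trials)))
frac = above / trials             # should be close to 1/2
```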

2.2 Case \(i=n\) and \(j=n+1\)

First, let us see how the joint survival function \({\bar{\mathbf {G}}}_{n,n+1}\) of \((R_n,R_{n+1})\) can be written as a distortion from the univariate survival function \({\bar{F}}\).

Proposition 4

The joint survival function \({\bar{\mathbf {G}}}_{n,n+1}\) of \((R_n,R_{n+1})\) can be written as

$$\begin{aligned} {\bar{\mathbf {G}}}_{n,n+1}(x_n,x_{n+1})={\hat{D}}_{n,n+1}({\bar{F}}(x_n), {\bar{F}}(x_{n+1})) \end{aligned}$$

for \(x_n\le x_{n+1}\), where

$$\begin{aligned} {\hat{D}}_{n,n+1}(u,v)=-\frac{1}{n!} v (-\log u)^n+{\bar{\gamma }}_{n+1}(-\log v) \end{aligned}$$
(12)

for \(1>u\ge v >0\) and

$$\begin{aligned} {\bar{\gamma }}_{n+1}(z)=\frac{1}{n!} \int _z^\infty x^{n}e^{-x}dx \end{aligned}$$
(13)

is the survival function of a gamma distribution with scale parameter equal to one and shape parameter equal to \(n+1\).

Proof

From Proposition 2 and (7), we know that \({\hat{D}}_{n,n+1}\) is a marginal distribution of the PDF given by

$$\begin{aligned} {\hat{d}}(u_1,\dots ,u_{n+1})=\frac{1}{u_1 \dots u_n} \end{aligned}$$

for \(1>u_1>\dots>u_{n+1}>0\) and zero elsewhere. Therefore, for \(1>u>v>0\), we have

$$\begin{aligned} {\hat{D}}_{n,n+1}(u,v)&=\int _0^v\int _{u_{n+1}}^u\int _{u_n}^1\dots \int _{u_2}^1 \frac{1}{u_1 \dots u_n} du_1\dots du_{n+1}\\&=\int _0^v\int _{u_{n+1}}^u\int _{u_n}^1\dots \int _{u_3}^1 (-\log u_2)\frac{1}{u_2 \dots u_n} du_2\dots du_{n+1}\\&=\int _0^v\int _{u_{n+1}}^u\int _{u_n}^1\dots \int _{u_4}^1 \frac{1}{2}(-\log u_3)^2\frac{1}{u_3 \dots u_n} du_3\dots du_{n+1}\\&=\dots \end{aligned}$$

and so on. Thus we finally arrive at

$$\begin{aligned} {\hat{D}}_{n,n+1}(u,v)&=\int _0^v\int _{u_{n+1}}^u \frac{1}{(n-1)!} (-\log u_n)^{n-1}\frac{1}{u_n} du_ndu_{n+1}\\&=\frac{1}{n!}\int _0^v \left[ (-\log u_{n+1})^{n}-(-\log u)^{n}\right] du_{n+1}\\&=-\frac{1}{n!}v(-\log u)^{n}+\frac{1}{n!}\int _0^v (-\log u_{n+1})^{n} du_{n+1} \end{aligned}$$

and, by doing the change \(z=-\log u_{n+1}\), we obtain expression (12). \(\square \)

Therefore, from (12), we get

$$\begin{aligned} \partial _1 {\hat{D}}_{n,n+1}(u,v)=\frac{1}{(n-1)!} (-\log u)^{n-1}\frac{v}{u} \end{aligned}$$

for \(1>u\ge v >0\) and from (8) the conditional survival function is

$$\begin{aligned} {\bar{\mathbf {G}}}_{n+1|n}(x_{n+1}|x_n)=(n-1)! \frac{\partial _{1}{\hat{D}}_{n,n+1}({\bar{F}}(x_n), {\bar{F}}(x_{n+1}))}{(-\log {\bar{F}}(x_n))^{n-1}}=\frac{{\bar{F}}(x_{n+1})}{{\bar{F}}(x_n)} \end{aligned}$$

for \(x_{n+1}\ge x_n\) and \(x_n\) such that \({\bar{F}}(x_n)>0\) and \(f(x_n)>0\). This expression is also a very well-known result (see, e.g., Nevzorov 2001, p. 68). Moreover, it shows that the record values form a Markov chain, that is,

$$\begin{aligned} \Pr (R_{n+1}>x_{n+1}|R_1=x_1,\dots ,R_n=x_n)&=\Pr (R_{n+1}>x_{n+1}|R_n=x_n)\\ {}&= {\bar{\mathbf {G}}}_{n+1|n}(x_{n+1}|x_n). \end{aligned}$$

Therefore, the median regression curve to predict \(R_{n+1}\) from \(R_n\) (or from \(R_1,\dots ,R_n\)) is the same as that obtained in the preceding subsection (see (11)), and so are the confidence regions. This is convenient since, if F is known, then the median regression curve and the confidence regions do not depend on n, and the sequence of paired records \((R_n,R_{n+1})\) for \(n=1,2,\dots \) can be plotted in the same graph (since they have the same conditional distribution). This is not the case if we need to estimate F or a parameter of F (see the examples in Sect. 3).
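For instance, with a standard exponential baseline \({\bar{F}}(x)=e^{-x}\), the conditional survival function above equals \(e^{-(x_{n+1}-x_n)}\), so the record increments \(R_{n+1}-R_n\) are independent standard exponential variables and \(R_n\) follows a Gamma(n, 1) distribution. The Markov property therefore allows the whole record sequence to be built recursively; a Python simulation sketch (names are ours):

```python
import random
from math import log

def exponential_records(n):
    """Simulate (R_1, ..., R_n) for a standard exponential baseline.

    Each record equals the previous one plus an independent Exp(1)
    increment, drawn by inverse transform as -log(U)."""
    records, r = [], 0.0
    for _ in range(n):
        r += -log(random.random())
        records.append(r)
    return records

random.seed(7)
reps = 20_000
mean_r3 = sum(exponential_records(3)[-1] for _ in range(reps)) / reps
# E[R_3] = 3 for a Gamma(3, 1) distribution
```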

2.3 Case \(i=m\) and \(j=n\)

In the general case, the joint survival function \({\bar{\mathbf {G}}}_{m,n}\) of \((R_m,R_{n})\) for \(1\le m<n\) can also be written as a distortion of \({\bar{F}}\). The result is stated in the following proposition.

Proposition 5

The joint survival function \({\bar{\mathbf {G}}}_{m,n}\) of \((R_m,R_{n})\) for \(1\le m<n\) can be written as

$$\begin{aligned} {\bar{\mathbf {G}}}_{m,n}(x_m,x_n)={\hat{D}}_{m,n}({\bar{F}}(x_m), {\bar{F}}(x_{n})) \end{aligned}$$

for \(x_m\le x_{n}\), where

$$\begin{aligned} {\hat{D}}_{m,n}(u,v)={\bar{\gamma }}_m(-\log v)+\frac{1}{(m-1)!}\int _{-\log u}^{-\log v} z^{m-1}e^{-z} {\bar{\gamma }}_{n-m}(-\log v-z) dz\nonumber \\ \end{aligned}$$
(14)

for \(1>u\ge v >0\) and \({\bar{\gamma }}_k\) is the survival function in (13).

Proof

From (3) (or from (3.2) in, Awad and Raqab 2000) the joint PDF of \((R_m,R_n)\) is

$$\begin{aligned} {\mathbf {g}}_{m,n}(x_m,x_n)=\frac{h(x_m)f(x_n)}{(m-1)!(n-m-1)!}H^{m-1}(x_m)\left[ H(x_n)-H(x_m) \right] ^{n-m-1} \end{aligned}$$

for \(x_m<x_n\) (zero elsewhere), where \(H(x)=-\log {\bar{F}}(x)\) and \(h(x)=H'(x)\) (the hazard rate function). Hence its survival function is

$$\begin{aligned}&{\bar{\mathbf {G}}}_{m,n}(x_m,x_n)\\&\quad =\int _{x_m}^\infty \int _{x_n}^\infty {\mathbf {g}}_{m,n}(y_m,y_n)dy_ndy_m\\&\quad =\int _{x_m}^{x_n} \int _{x_n}^\infty {\mathbf {g}}_{m,n}(y_m,y_n)dy_ndy_m +\int _{x_n}^{\infty } \int _{y_m}^\infty {\mathbf {g}}_{m,n}(y_m,y_n)dy_ndy_m\\&\quad =\int _{x_m}^{x_n} \int _{x_n}^\infty \frac{h(y_m)f(y_n)H^{m-1}(y_m)}{(m-1)!(n-m-1)!}\left[ H(y_n)-H(y_m) \right] ^{n-m-1} dy_ndy_m\\&\quad \quad +\int _{x_n}^{\infty } \int _{y_m}^\infty \frac{h(y_m)f(y_n)H^{m-1}(y_m)}{(m-1)!(n-m-1)!}\left[ H(y_n)-H(y_m) \right] ^{n-m-1} dy_ndy_m. \end{aligned}$$

Now we do the change \(x=H(y_n)-H(y_m)\), with \(dx=h(y_n)dy_n\) and \({\bar{F}}(y_m) e^{-x}={\bar{F}}(y_n)\). Hence

$$\begin{aligned}&{\bar{\mathbf {G}}}_{m,n}(x_m,x_n)\\&\quad =\int _{x_m}^{x_n} \int _{H(x_n)-H(y_m)}^\infty \frac{h(y_m) {\bar{F}}(y_m) x^{n-m-1}e^{-x}}{(m-1)!(n-m-1)!}H^{m-1}(y_m) dx dy_m\\&\quad \quad +\int _{x_n}^{\infty } \int _{0}^\infty \frac{h(y_m) {\bar{F}}(y_m) x^{n-m-1}e^{-x}}{(m-1)!(n-m-1)!}H^{m-1}(y_m) dx dy_m\\&\quad =\int _{x_m}^{x_n} \frac{h(y_m) {\bar{F}}(y_m) H^{m-1}(y_m)}{(m-1)!}{\bar{\gamma }}_{n-m}(H(x_n)-H(y_m)) dy_m \\&\quad \quad +\int _{x_n}^{\infty } \frac{h(y_m) {\bar{F}}(y_m) H^{m-1}(y_m)}{(m-1)!} dy_m \end{aligned}$$

and by doing the change \(z=H(y_m)\), with \(dz=h(y_m)dy_m\) and \( e^{-z}={\bar{F}}(y_m)\), we get

$$\begin{aligned} {\bar{\mathbf {G}}}_{m,n}(x_m,x_n)&=\int _{H(x_m)}^{H(x_n)} \frac{z^{m-1}e^{-z}}{(m-1)!} {\bar{\gamma }}_{n-m}(H(x_n)-z) dz +\int _{H(x_n)}^{\infty }\frac{z^{m-1}e^{-z}}{(m-1)!} dz\\&={\bar{\gamma }}_m(H(x_n))+ \frac{1}{(m-1)!} \int _{H(x_m)}^{H(x_n)} z^{m-1} e^{-z} {\bar{\gamma }}_{n-m}(H(x_n)-z) dz \\&={\hat{D}}_{m,n} ({\bar{F}}(x_m),{\bar{F}}(x_n)) \end{aligned}$$

for \(x_m\le x_n\) as stated above. \(\square \)

Therefore, from (14), we obtain

$$\begin{aligned} \partial _1 {\hat{D}}_{m,n}(u,v)&=\frac{1}{u} \frac{(-\log u)^{m-1}}{(m-1)!} e^{\log (u)} {\bar{\gamma }}_{n-m}(-\log v+\log u)\\&= \frac{1}{(m-1)!} (-\log u)^{m-1} {\bar{\gamma }}_{n-m}\left( -\log \frac{v}{u}\right) \end{aligned}$$

for \(1>u>v>0\) and, from (8), the conditional survival function of \((R_n|R_m=x_m)\) is

$$\begin{aligned} {\bar{\mathbf {G}}}_{n|m}(x_n|x_m)=(m-1)! \frac{\partial _{1}{\hat{D}}_{m,n}({\bar{F}}(x_m), {\bar{F}}(x_n))}{(-\log {\bar{F}}(x_m))^{m-1}}= {\bar{\gamma }}_{n-m}\left( -\log \frac{{\bar{F}}(x_n)}{{\bar{F}}(x_m)}\right) \end{aligned}$$
(15)

for \(x_n\ge x_m\) since \(\lim _{v\rightarrow 0^+}\partial _{1}{\hat{D}}_{m,n}(u,v)=0\) for all \(0<u<1\). Note that (15) proves that \({\bar{\mathbf {G}}}_{n|m}\) is also a distorted distribution since

$$\begin{aligned} {\bar{\mathbf {G}}}_{n|m}(x_n|x_m)={\bar{q}}_{n|m}({\bar{F}}(x_n)|{\bar{F}}(x_m)) \end{aligned}$$

with \({\bar{q}}_{n|m}(v|{\bar{F}}(x_m))= {\bar{\gamma }}_{n-m}\left( -\log v+\log {\bar{F}}(x_m) \right) \) for \(0\le v\le {\bar{F}}(x_m)\).

Therefore, the median regression curve to predict \(R_n\) from \(R_m=x_m\) is given by

$$\begin{aligned} m_{n|m}(x_m)={\bar{\mathbf {G}}}^{-1}_{n|m}(0.5|x_m)={\bar{F}}^{-1} (c_{n-m}(0.5){\bar{F}}(x_m)), \end{aligned}$$
(16)

where \({\bar{F}}^{-1}\) is the inverse function of \({\bar{F}}\),

$$\begin{aligned} c_{k}(y)=\exp (-\gamma _{k}^{-1} (y)) \end{aligned}$$
(17)

and \(\gamma _{k}^{-1}\) is the quantile function of a gamma distribution with shape parameter k and scale parameter equal to one.

Analogously, the \(50\%\) and \(90\%\) quantile confidence bands for this prediction are given by

$$\begin{aligned} \left[ {\bar{F}}^{-1}(c_{n-m}(0.25){\bar{F}}(x_m)), {\bar{F}}^{-1}(c_{n-m}(0.75){\bar{F}}(x_m))\right] \end{aligned}$$

and

$$\begin{aligned} \left[ {\bar{F}}^{-1}(c_{n-m}(0.05){\bar{F}}(x_m)),{\bar{F}}^{-1}(c_{n-m}(0.95){\bar{F}}(x_m))\right] . \end{aligned}$$

Note that they only depend on \({\bar{F}}\) and on \(k=n-m\). Hence, if \({\bar{F}}\) is known, then we have common confidence bands for the sequence of paired records \((R_1,R_{1+k}), (R_2,R_{2+k}),\dots \) They are plotted in Sect. 3 for different survival functions \({\bar{F}}\). This is not the case if we estimate \({\bar{F}}\) (or a parameter of \({\bar{F}}\)). Of course, these general formulas can be used to obtain the particular formulas in the preceding subsections.
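For integer \(k\), \({\bar{\gamma }}_k\) is the Erlang survival function \(e^{-z}\sum _{j=0}^{k-1}z^j/j!\), so (15) is easy to evaluate for any baseline. A Python sketch (the function names are ours):

```python
from math import exp, factorial, log

def erlang_sf(k, z):
    """Survival function gamma-bar_k(z) of a Gamma(k, 1) law, integer shape k."""
    return exp(-z) * sum(z ** j / factorial(j) for j in range(k))

def cond_survival(xn, xm, n, m, sbar):
    """Gbar_{n|m}(x_n | x_m) from (15), given a baseline survival function sbar."""
    return erlang_sf(n - m, -log(sbar(xn) / sbar(xm)))

# for a standard exponential baseline the argument reduces to x_n - x_m
expo = lambda x: exp(-x)
```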

3 Examples

We consider here some classical models to show how to proceed when F is known, or when F belongs to a parametric family \(F_\theta \) with an unknown parameter \(\theta \). We also study the case in which F is completely unknown.

3.1 Uniform distribution

As we have mentioned before, if F is known, then the predictions and the confidence bands do not depend on n. Hence we can plot the paired records, the predictions for new records and the confidence bands in the same figure. Let us see some examples.

The simplest case is a standard uniform distribution with \(F(x)=x\) for \(0\le x \le 1\). Hence, from (11), the median regression curve to predict \(R_{n+1}\) from \(R_n=x_n\) (or from \(R_1=x_1,\dots ,R_n=x_n\)) is

$$\begin{aligned} m_{n+1|n}(x_n)={\bar{F}}^{-1}(0.5{\bar{F}}(x_n))=1-0.5(1-x_n)=0.5+0.5x_n \end{aligned}$$

since \({\bar{F}}^{-1}(x)=1-x\) is the inverse function of \({\bar{F}}(x)=1-x\). The classical mean regression curve can be obtained as

$$\begin{aligned} \tilde{m}_{n+1|n}(x_n)=x_n+\int _{x_n}^1 \frac{1-x}{1-x_n}dx=0.5+0.5x_n, \end{aligned}$$

that is, in this case, both curves coincide.

Analogously, the \(50\%\) and \(90\%\) confidence bands for the predictions are

$$\begin{aligned} \left[ 0.25+0.75x_n,0.75+0.25x_n\right] \text { and } \left[ 0.05+0.95x_n,0.95+0.05x_n\right] . \end{aligned}$$

They are plotted in Fig. 1 (left) jointly with \(m_{n+1|n}\) and the sequence of paired records \((R_1,R_2),\dots ,(R_8,R_9)\). The sequence of the first nine records obtained by simulation is

$$\begin{aligned} 0.31916, 0.78427, 0.87288, 0.90177, 0.95036, 0.98365, 0.98411, 0.99982, 0.99996. \end{aligned}$$
Fig. 1 Plots of the paired records \((R_n,R_{n+1})\) (left) and \((R_n,R_{n+2})\) (right) from a standard uniform distribution jointly with the median regression curve (black) and the limits for the \(50\%\) (blue) and \(90\%\) (red) confidence bands

The predicted values \(m_{n+1|n}(R_{n})\) and the confidence intervals \([l_n,u_n]\) (\(50\%\)) and \([L_n,U_n]\) (\(90\%\)) for these records are given in Table 1. Note that 5 out of 8 points belong to the \(50\%\) confidence interval (a little above the expected coverage) and 7 out of 8 belong to the \(90\%\) confidence interval (a little below the expected coverage). We can compare the prediction for \(R_9\) with that obtained from the (sample) regression line \(m(x)=0.6688+0.3097x\) based on the pairs \((R_1,R_2),\dots ,(R_7,R_8)\). This prediction is \(m(R_8)=0.6688+0.3097\cdot 0.999815=0.9784427\). The prediction in Table 1 is \(m_{n+1|n}( R_8)= 0.9999075\) (median regression curve) and the exact value is \(R_9=0.999959\). The respective absolute errors are 0.02151629 (estimated regression line) and 0.0000515 (exact median regression curve). As expected, the second prediction is better (since it takes into account the information about the model).

Table 1 Predicted values and centered confidence intervals \([l_n,u_n]\) (\(50\%\)) and \([L_n,U_n]\) (\(90\%\)) for the first nine records from a standard uniform distribution
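The comparison above can be reproduced with a short Python sketch; it uses the rounded record values listed before (so the last digits may differ slightly from those reported in the text), and the variable names are ours:

```python
# rounded records from the standard uniform simulation above
recs = [0.31916, 0.78427, 0.87288, 0.90177, 0.95036,
        0.98365, 0.98411, 0.99982, 0.99996]

xs, ys = recs[:7], recs[1:8]            # pairs (R_1,R_2), ..., (R_7,R_8)
mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = my - slope * mx

ols_pred = intercept + slope * recs[7]  # regression-line prediction of R_9
median_pred = 0.5 + 0.5 * recs[7]       # exact median regression curve (11)
```

The fitted line is close to \(m(x)=0.6688+0.3097x\), and the model-based median prediction has a much smaller absolute error for \(R_9\).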

Analogously, from (16), the median regression curve to predict \(R_{n+k}\) from \(R_n=x_n\) (or from \(R_1=x_1,\dots ,R_n=x_n\)) is

$$\begin{aligned} m_{n+k|n}(x_n)= & {} {\bar{F}}^{-1}(c_k(0.5) {\bar{F}}(x_n))=1-c_k(0.5) (1-x_n)=1-c_k(0.5)\\&+c_k(0.5) x_n, \end{aligned}$$

where \(c_k(0.5)=\exp (-\gamma ^{-1}_k(0.5))\) (see (17)). The values of \(c_k(0.5)\) for \(k=1,2,3,4,5\) are

$$\begin{aligned} 0.5, 0.186682309, 0.068971610, 0.025424023, 0.009363755. \end{aligned}$$
(18)

In the statistical program R, \(c_k(u)\) can be computed as exp(-qgamma(u,k)).
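A pure-Python equivalent (a sketch with our own function names; for integer shape the gamma CDF is the Erlang formula and can be inverted by bisection) recovers the constants in (18):

```python
from math import exp, factorial

def erlang_sf(k, z):
    """Survival function of a Gamma(k, 1) law for integer shape k."""
    return exp(-z) * sum(z ** j / factorial(j) for j in range(k))

def gamma_quantile(k, y, tol=1e-12):
    """Quantile gamma_k^{-1}(y) by bisection on the CDF 1 - erlang_sf."""
    lo, hi = 0.0, 100.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if 1.0 - erlang_sf(k, mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def c_k(k, y):
    """c_k(y) = exp(-gamma_k^{-1}(y)) as in (17)."""
    return exp(-gamma_quantile(k, y))
```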

In a similar way, the \(50\%\) and \(90\%\) confidence bands for the predictions are

$$\begin{aligned} \left[ 1-c_k(0.25)+c_k(0.25) x_n,1-c_k(0.75)+c_k(0.75) x_n\right] \end{aligned}$$

and

$$\begin{aligned} \left[ 1-c_k(0.05)+c_k(0.05) x_n,1-c_k(0.95)+c_k(0.95) x_n\right] . \end{aligned}$$

They are plotted in Fig. 1 (right) for \(k=2\) jointly with \(m_{n+2|n}\) and the sequence of the first seven paired records \((R_1,R_3),\dots ,(R_7,R_9)\). The prediction obtained for the ninth record from the seventh one is

$$\begin{aligned} {\hat{R}}_9=m_{n+2|n}(R_7)= 1-0.186682309+ 0.186682309\cdot 0.984107= 0.9970331 \end{aligned}$$

with confidence intervals [0.9939225, 0.998924] (\(50\%\)) and [0.9888603, 0.9998617] (\(90\%\)). In this case, the exact value \(R_9=0.999959\) lies outside both intervals. The absolute error is 0.002925942. If we use the estimated regression line \(m(x)=0.8019+0.1832x\) obtained from \((R_1,R_3),\dots ,(R_6,R_8)\), we get the prediction \(m(R_7)=0.9821884\) and the absolute error 0.0177706.

The same method can be applied to other (known) distributions. Some examples are given in the next subsections. Note that the quantile functions of the most usual models are included in statistical programs such as R.

3.2 PHR model

If the baseline distribution function \(F_{\theta }\) has a known parametric form with an unknown parameter \(\theta \) and \(R_1=x_1,\dots ,R_n=x_n\) are known, then we can use (3) to get the likelihood function for \(\theta \) as

$$\begin{aligned} \ell (\theta )=h_\theta (x_1)\dots h_\theta (x_n){\bar{F}}_\theta (x_n). \end{aligned}$$

This expression can be used to obtain the maximum likelihood estimator (MLE) \({\hat{\theta }}\) of \(\theta \). Numerical methods can be used to get this estimator.

Then, as in the preceding subsection, we can use \({\bar{F}}_{{\hat{\theta }}}\) to get the estimated median regression curve and the estimated confidence bands replacing \({\bar{F}}\) with \({\bar{F}}_{{\hat{\theta }}}\) in the theoretical results obtained in Sect. 2.

In this subsection, we consider a very popular model in reliability and survival studies, the proportional hazard rate (PHR) Cox model. This model contains useful distributions such as the exponential and Pareto models (studied in the following subsections). It is defined by a proportional hazard rate function \(h_\theta (x)=\theta h(x),\) where h is a known hazard rate function and \(\theta >0\) is an unknown parameter. Hence, its survival function is \({\bar{F}}_\theta (x)={\bar{F}}^\theta (x),\) where \({\bar{F}}\) is the survival function associated with h. Then, the likelihood function is

$$\begin{aligned} \ell (\theta )=\theta ^n h(x_1)\dots h(x_n){\bar{F}}^\theta (x_n). \end{aligned}$$

Maximizing \(\ell \) is equivalent to maximizing

$$\begin{aligned} {\tilde{\ell }}(\theta )=\log \ell (\theta )=n\log \theta +\theta \log {\bar{F}}(x_{n})+\sum _{i=1}^n\log h(x_i) \end{aligned}$$

with

$$\begin{aligned} {\tilde{\ell }}^\prime (\theta )=\frac{n}{\theta }+\log {\bar{F}}(x_{n}). \end{aligned}$$

Then the MLE of \(\theta \) is

$$\begin{aligned} {\hat{\theta }}_n=-\frac{n}{\log {\bar{F}}(x_{n})}. \end{aligned}$$
(19)
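A direct Python sketch of (19) (the function and argument names are ours); the caller supplies \(\log {\bar{F}}\), and for the exponential baseline \({\bar{F}}(x)=e^{-x}\) of Sect. 3.3 the estimator reduces to the classical \({\hat{\theta }}_n=n/x_n\):

```python
def mle_theta(records, log_sbar):
    """MLE of the PHR parameter from (19): theta_hat = -n / log Fbar(x_n).

    records  -- observed record values x_1 < ... < x_n
    log_sbar -- log of the baseline survival function Fbar
    """
    n = len(records)
    return -n / log_sbar(records[-1])

# exponential baseline: log Fbar(x) = -x, so theta_hat = n / x_n
theta_hat = mle_theta([0.5, 1.2, 2.0, 2.5], lambda x: -x)
```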

Therefore, from (11), the exact median regression curve for the PHR model is

$$\begin{aligned} m_{n+1|n}(x_n)={\bar{F}}_\theta ^{-1}(0.5{\bar{F}}_\theta (x_n))={\bar{F}}^{-1}(0.5^{1/\theta }{\bar{F}}(x_n)) \end{aligned}$$

since \({\bar{F}}_{\theta }^{-1}(y)={\bar{F}}^{-1}(y^{1/\theta })\) is the inverse function of \({\bar{F}}_\theta \). If we replace \(\theta \) with the MLE \({\hat{\theta }}_n\) given in (19), we obtain the maximum likelihood estimated median regression curve (EMRC) as

$$\begin{aligned} {\hat{m}}_{n+1|n}(x_n)={\bar{F}}^{-1}(0.5^{1/{{\hat{\theta }}_n}}{\bar{F}}(x_n))={\bar{F}}^{-1}\left( {\bar{F}}(x_n)0.5^{-\frac{1}{n}\log {\bar{F}}(x_n)}\right) . \end{aligned}$$

Analogously, the \(50\%\) and \(90\%\) estimated quantile confidence bands (EQCB) are

$$\begin{aligned} \left[ {\bar{F}}^{-1}\left( {\bar{F}}(x_n)0.75^{-\frac{1}{n}\log {\bar{F}}(x_n)}\right) ,{\bar{F}}^{-1}\left( {\bar{F}}(x_n)0.25^{-\frac{1}{n}\log {\bar{F}}(x_n)}\right) \right] \end{aligned}$$

and

$$\begin{aligned} \left[ {\bar{F}}^{-1}\left( {\bar{F}}(x_n)0.95^{-\frac{1}{n}\log {\bar{F}}(x_n)}\right) ,{\bar{F}}^{-1}\left( {\bar{F}}(x_n)0.05^{-\frac{1}{n}\log {\bar{F}}(x_n)}\right) \right] , \end{aligned}$$

respectively.
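These computations only require the baseline survival function and its inverse. The following Python sketch (the function names are ours, and any strictly decreasing baseline survival function can be plugged in) implements the MLE (19) and the estimated conditional quantile curve for a generic PHR model:

```python
import math

def phr_theta_mle(xn, n, sbar):
    # MLE (19): theta_hat = -n / log(Fbar(x_n)), where sbar is the
    # baseline survival function Fbar of the PHR model
    return -n / math.log(sbar(xn))

def phr_quantile_curve(xn, n, q, sbar, sbar_inv):
    # Estimated conditional q-quantile curve for R_{n+1} given R_n = x_n:
    # Fbar^{-1}( Fbar(x_n) * q^{-(1/n) log Fbar(x_n)} )
    return sbar_inv(sbar(xn) * q ** (-math.log(sbar(xn)) / n))
```

With the standard exponential baseline \({\bar{F}}(x)=e^{-x}\), `phr_quantile_curve` reduces to the straight line \(x_n-\log (q)\,x_n/n\) obtained in the next subsection.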

In the general case, if we want to predict \(R_{n+k}\) from \(R_n\) for \(k>0\), from (16), the EMRC is

$$\begin{aligned} {\hat{m}}_{n+k|n}(x_n)={\bar{F}}_{{\hat{\theta }}_n}^{-1}\left( c_{k}(0.5){\bar{F}}_{{\hat{\theta }}_n}(x_n)\right) ={\bar{F}}^{-1}\left( {\bar{F}}(x_n)(c_{k}(0.5))^{-\frac{1}{n}\log {\bar{F}}(x_n)}\right) , \end{aligned}$$
(20)

where \(c_k\) is defined in (17). The EQCB are obtained in a similar way by replacing 0.5 with 0.05, 0.25, 0.75, 0.95.
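Equation (17) is not reproduced in this excerpt, but the values \(c_k(q)\) can be computed numerically from the standard representation of record spacings: for a continuous parent distribution, \(-\log ({\bar{F}}(R_{n+k})/{\bar{F}}(R_n))\) given \(R_n\) follows a Gamma(k, 1) (Erlang) law, so \(-\log c_k(q)\) is the upper q-quantile of that law. A Python sketch under this assumption (our own function names; a bisection search replaces a gamma quantile routine):

```python
import math

def erlang_sf(t, k):
    # Survival function P(G > t) of G ~ Gamma(k, 1) for integer k (Erlang)
    return math.exp(-t) * sum(t ** i / math.factorial(i) for i in range(k))

def c_k(q, k):
    # Solve erlang_sf(t, k) = q for t by bisection and return c_k(q) = exp(-t)
    lo, hi = 0.0, 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if erlang_sf(mid, k) > q:
            lo = mid
        else:
            hi = mid
    return math.exp(-0.5 * (lo + hi))
```

For \(k=2\) and \(q=0.5\) this gives \(-\log c_2(0.5)\approx 1.678347\), the constant used in Sect. 4, and \(c_1(q)=q\), as in (11).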

3.3 Exponential distribution

The exponential model with survival function \({\bar{F}}_\theta (x)=\exp (-\theta x)\) for \(x\ge 0\) satisfies the PHR model with \(h(x)=1\) and \({\bar{F}}(x)=\exp (-x)\). Then, from the results given in the preceding subsection, the MLE for \(\theta \) is \({\hat{\theta }}_n=n/x_n\). This is a well-known result (see, e.g., Awad and Raqab 2000).

If \(\theta \) is known, the exact median regression curve to predict \(R_{n+1}\) from \(R_n=x_n\) (or from \(R_1=x_1,\dots ,R_n=x_n\)) is

$$\begin{aligned} m_{n+1|n}(x_n)={\bar{F}}^{-1}(0.5^{1/\theta }{\bar{F}}(x_n))=x_n-\log (0.5)\frac{1}{\theta }\end{aligned}$$

since \({\bar{F}}^{-1}(x)=-\log (x)\) is the inverse function of \({\bar{F}}(x)=\exp (- x)\).

If \(\theta \) is unknown, we replace \(\theta \) with the MLE \({\hat{\theta }}_n\), obtaining the maximum likelihood EMRC as

$$\begin{aligned} {\hat{m}}_{n+1|n}(x_n)=x_n-\log (0.5)\frac{1}{{\hat{\theta }}_n}=x_n-\frac{\log (0.5)}{n} x_n. \end{aligned}$$
(21)

The maximum likelihood estimated regression curve (ERC) \(\tilde{m}_{n+1|n}(x_n)=E_{{\hat{\theta }}}(R_{n+1}|R_n=x_n)\) can be obtained as

$$\begin{aligned} \tilde{m}_{n+1|n}(x_n)=x_n+\int _{x_n}^\infty \frac{\exp (-{\hat{\theta }} x)}{\exp (-{\hat{\theta }} x_n)}dx=x_n+\frac{1}{{\hat{\theta }}_n}=x_n+\frac{1}{n}x_n, \end{aligned}$$
(22)

that is, in this case, the two curves (lines) do not coincide.
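Both lines are straightforward to evaluate. The following Python sketch (the function names are ours) implements the EMRC (21), for a general quantile level q, and the ERC (22):

```python
import math

def emrc_exp(xn, n, q=0.5):
    # Estimated conditional q-quantile line for R_{n+1} given R_n = x_n,
    # cf. (21): x_n - log(q) x_n / n  (q = 0.5 gives the EMRC)
    return xn - math.log(q) * xn / n

def erc_exp(xn, n):
    # Estimated mean regression line (22): x_n + x_n / n
    return xn + xn / n
```

Calling `emrc_exp` with q equal to 0.25 and 0.75 (or 0.05 and 0.95) gives the limits of the estimated quantile confidence bands below.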

Another option is to use the joint MLP for \((\theta ,R_{n+1})\) from \(R_1=x_1,\dots ,\) \(R_n=x_n\). It is well known that this procedure leads to \({\hat{\theta }}=(n+1)/x_n\) and the trivial estimator \({\hat{R}}_{n+1}=x_n\).

Since, for the exponential distribution, both curves \({\hat{m}}_{n+1|n}(x_n)\) and \(\tilde{m}_{n+1|n}(x_n)\) are straight lines, we could also use a (sample) linear mean (or quantile) regression to predict \(R_{n+1}\) from a sequence of paired records (as in Sect. 3.1).

From the general results for the PHR model, the \(50\%\) and \(90\%\) estimated quantile confidence bands (EQCB) for the exponential distribution are

$$\begin{aligned} \left[ x_n-\log (0.75)\frac{1}{{\hat{\theta }}_n},x_n-\log (0.25)\frac{1}{{\hat{\theta }}_n}\right] =\left[ x_n-\log (0.75)\frac{x_n}{n},x_n-\log (0.25)\frac{x_n}{n}\right] \end{aligned}$$

and

$$\begin{aligned} \left[ x_n-\log (0.95)\frac{1}{{\hat{\theta }}_n},x_n-\log (0.05)\frac{1}{{\hat{\theta }}_n}\right] =\left[ x_n-\log (0.95)\frac{x_n}{n},x_n-\log (0.05)\frac{x_n}{n}\right] , \end{aligned}$$

respectively. They were obtained previously in Awad and Raqab (2000). Note that they depend on n.

If \(\theta \) is known, the exact confidence bands are obtained just by replacing \({\hat{\theta }}_n\) with \(\theta \). In this case, they do not depend on n and we can plot together the regression curves, the confidence bands and the sequence of paired records. We do so in Fig. 2, left, for \(\theta =1\) including the exact regression curves and the sequence of the first eight paired records \((R_1,R_2),\dots ,(R_8,R_9)\). The simulated sequence of records \(R_1,\dots ,R_9\) obtained is

$$\begin{aligned} 0.45701, 1.19403, 1.74177, 2.00398, 2.25833, 4.97619, 5.84512, 6.90868, 14.16568. \end{aligned}$$

The purple line represents the sample regression line \(m(x)=-0.2967+1.6335x \) obtained from \((R_1,R_2),\dots ,(R_8,R_9)\), which is a very poor approximation of both theoretical regression lines (it is strongly determined by the data and does not take into account the information about the model).

Fig. 2
figure 2

Plot of the paired records \((R_n,R_{n+1})\) from a standard exponential distribution jointly with the median regression curve (black), the theoretical and sample regression curves (green, purple) and the limits for the \(50\%\) (blue) and \(90\%\) (red) exact confidence bands (left). The same is done in the right plot by assuming that \(\theta \) is unknown

If \(\theta \) is unknown, the estimated confidence bands \([l_n,u_n]\) (\(50\%\)) and \([L_n,U_n]\) (\(90\%\)) depend on n. They are plotted (blue, red) in Fig. 2, right, for \(\theta =1\) by connecting the points obtained for \(n=1,\dots ,8\). We also include the estimated regression curves \({\hat{m}}_{n+1|n}\) (black) and \({\tilde{m}}_{n+1|n}\) (green) and the simulated sequence of paired records. These values are given in Table 2. In this sample, the empirical coverage proportions for \([l_n,u_n]\) and \([L_n,U_n]\) are 0.625 (5-out-of-8) and 0.75 (6-out-of-8), respectively. The MLEs \({\hat{\theta }}_n=n/R_n\) of \(\theta =1\) for \(n=1,\dots ,8\) are

$$\begin{aligned} 2.18813, 1.67499, 1.72239, 1.99603, 2.21403, 1.20574, 1.19758, 1.15796. \end{aligned}$$

They are used both in \({\hat{m}}_{n+1|n}(R_n)\) and \({\tilde{m}}_{n+1|n}(R_n)\). The mean squared errors (MSE) for these eight predictions are 12.39879 and 13.70644, respectively. Analogously, the mean absolute errors (MAE) are 2.684359 and 2.877943. However, in this case, the joint maximum likelihood predictor (MLP) \({\hat{R}}_{n+1}=R_n\) provides the best predictions, with respective errors 9.747176 and 2.320351. Note that the MLP predictions always lie outside the centered confidence bands. If we use this joint MLP, the centered bands should be replaced with the bottom confidence bands

$$\begin{aligned} \left[ x_n,x_n-\log (0.5)\frac{1}{{\hat{\theta }}}\right] =\left[ x_n,x_n-\log (0.5)\frac{x_n}{n}\right] \end{aligned}$$

(\(50\%\)) and

$$\begin{aligned} \left[ x_n,x_n-\log (0.1)\frac{1}{{\hat{\theta }}}\right] =\left[ x_n,x_n-\log (0.1)\frac{x_n}{n}\right] \end{aligned}$$

(\(90\%\)), where \(x_n=R_n\).

Table 2 Predicted values \({\hat{R}}_{n+1}={\hat{m}}_{n+1|n}(R_n)\) and \({\tilde{R}}_{n+1}={\tilde{m}}_{n+1|n}(R_n)\) and centered confidence intervals \([l_n,u_n]\) (\(50\%\)) and \([L_n,U_n]\) (\(90\%\)) for the first nine records from a standard exponential distribution when \(\theta \) is unknown. \(R_{n+1}\) represents the exact values for \(n=1,\dots ,8\)

Analogously, from (16), the exact median regression curve to predict \(R_{n+k}\) from \(R_n=x_n\) (or from \(R_1=x_1,\dots ,R_n=x_n\)) is

$$\begin{aligned} m_{n+k|n}(x_n)={\bar{F}}^{-1}(c_k(0.5) {\bar{F}}(x_n))=x_n-\frac{1}{\theta } \log (c_k(0.5)) , \end{aligned}$$
(23)

where \(c_k\) is given in (17) (some of these values were given in (18)). The confidence bands and their estimated versions can be obtained in a similar way. For example, the EMRC is

$$\begin{aligned} {\hat{m}}_{n+k|n}(x_n)=x_n-\frac{1}{{\hat{\theta }}_n} \log (c_k(0.5))=x_n-\frac{x_n}{n} \log (c_k(0.5)). \end{aligned}$$
(24)

Remark 1

Let us compare the different predictors for \(R_s\) from \(R_n\) in this model. The best linear unbiased predictor (BLUP) is \({\hat{R}}_s^{(1)}=\frac{s}{n} R_n\), the best linear invariant predictor (BLIP) is \({\hat{R}}_s^{(2)}=\frac{s+1}{n+1} R_n\), and the maximum likelihood predictor (MLP) is \({\hat{R}}_s^{(3)}=\frac{s}{n+1} R_n\). They were compared in Awad and Raqab (2000) as well. The predictor obtained from Volovskiy and Kamps (2020a) for this model is \({\hat{R}}_s^{(4)}=\frac{s-1}{n} R_n\). Finally, the one obtained from (24) is \({\hat{R}}_s^{(5)}=\left( 1-\frac{1}{n} \log (c_k(0.5))\right) R_n\) with \(k=s-n\). We obtain \(m=1000\) simulated values for \(R_n\) and \(R_s\) with \(\theta =1\), \(n=1\) and \(s=2,3,4,5\). To compare the predictions we use the mean squared error

$$\begin{aligned} MSE_s(i)=\frac{1}{m} \sum _{j=1}^m (R_s(j) - {\hat{R}}_s^{(i)} (j))^2 \end{aligned}$$

and the mean absolute error

$$\begin{aligned} MAE_s(i)=\frac{1}{m} \sum _{j=1}^m |R_s(j) - {\hat{R}}_s^{(i)}(j)| \end{aligned}$$

for \(i=1,2,3,4,5\), where \(R_s(j)\) and \({\hat{R}}_s^{(i)}(j)\) represent the exact and estimated values obtained in the \(m=1000\) simulations for \(j=1,\dots ,m\). The MSE and MAE obtained are given in Table 3. In this case (and with these error measures), the best predictions are obtained with the BLIP (\(i=2\)). For more comparisons see Raqab and Nagaraja (1995) and Volovskiy and Kamps (2020a, 2020b).
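A simulation of this kind can be sketched in Python (the function names and the seed are ours). For a standard exponential parent, the record increments \(R_{j+1}-R_j\) are iid standard exponential variables, which gives a direct way to simulate \(R_n\) and \(R_s\); for simplicity the sketch takes \(s=n+1\), so that \(-\log c_1(0.5)=\log 2\) in the EMRC predictor:

```python
import math
import random

random.seed(42)

def predictors(rn, n, s):
    # Point predictors from Remark 1; the EMRC line is written for s = n + 1,
    # where -log c_1(0.5) = log 2
    return {
        "BLUP": s / n * rn,
        "BLIP": (s + 1) / (n + 1) * rn,
        "MLP": s / (n + 1) * rn,
        "VK": (s - 1) / n * rn,
        "EMRC": (1.0 + math.log(2) / n) * rn,
    }

def compare(m=1000, n=1, s=2):
    # Record increments of a standard exponential parent are iid Exp(1)
    sq_err = {k: 0.0 for k in predictors(1.0, n, s)}
    for _ in range(m):
        rn = sum(random.expovariate(1.0) for _ in range(n))
        rs = rn + sum(random.expovariate(1.0) for _ in range(s - n))
        for k, pred in predictors(rn, n, s).items():
            sq_err[k] += (rs - pred) ** 2
    return {k: v / m for k, v in sq_err.items()}
```

The MAE comparison is analogous, replacing the squared differences by absolute ones.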

Table 3 \(MSE_{s}(i)\) and \(MAE_{s}(i)\) for the predictors considered in Remark 1 for the record values \(R_s\) and \(R_1\) in an exponential model with \(\theta =1\)

3.4 Pareto distribution

Another popular model studied in the literature of records is the Pareto (type II) model with survival function \({\bar{F}}_{\theta }(x)=(1+x)^{-\theta }\) and hazard rate function \(h(x)=\theta /(1+x)\) for \(x\ge 0\) and \(\theta >0\). It is also included in the PHR model.

If \(\theta \) is known, from (11), the exact median regression curve to predict \(R_{n+1}\) from \(R_n=x_n\) (or from \(R_1=x_1,\dots ,R_n=x_n\)) is

$$\begin{aligned} m_{n+1|n}(x_n)={\bar{F}}_{\theta }^{-1}\left( 0.5{\bar{F}}_{\theta }(x_n)\right) =-1+(1+x_n)0.5^{-1/\theta } \end{aligned}$$
(25)

since \({\bar{F}}_{\theta }^{-1}(x)=-1+x^{-1/\theta }\) is the inverse function of \({\bar{F}}_\theta \). The exact \(50\%\) and \(90\%\) confidence bands are obtained in a similar way as

$$\begin{aligned} \left[ -1+(1+x_n)0.75^{-1/\theta },-1+(1+x_n)0.25^{-1/\theta }\right] \end{aligned}$$

and

$$\begin{aligned} \left[ -1+(1+x_n)0.95^{-1/\theta },-1+(1+x_n)0.05^{-1/\theta }\right] , \end{aligned}$$

respectively. They are plotted in Fig. 3 (left) for \(\theta =2\) jointly with the sequence of the first nine records obtained by simulation:

$$\begin{aligned}&0.40174, 0.42488, 0.49916, 1.48359, 4.0376, 4.90751,\\&10.24807, 39.27022, 49.69001. \end{aligned}$$

Note that the curve and the regions do not depend on n.

In this case, the classical mean regression curve is

$$\begin{aligned} \tilde{m}_{n+1|n}(x_n)=E(R_{n+1}|R_n=x_n)=x_n+\int _{x_n}^\infty \left( \frac{1+x}{1+x_n}\right) ^{-\theta }dx=\frac{1+\theta x_n}{\theta -1} \end{aligned}$$

for \(\theta >1\). If \(0<\theta \le 1\) it does not exist (however, the curve \(m_{n+1|n}\) in (25) always exists). The curve \({\tilde{m}}_{n+1|n}\) is also included in Fig. 3, left (green). Note that \(m_{n+1|n}\ne \tilde{m}_{n+1|n}\). Actually, in this case (\(\theta =2\)), \({\tilde{m}}_{n+1|n}\) coincides with the upper limit of the \(50\%\) confidence band (i.e. the third conditional quartile curve) since \(-1+0.25^{-1/\theta }(1+x_n)=1+2x_n={\tilde{m}}_{n+1|n}(x_n)\) for \(\theta =2\).

If \(\theta \) is unknown, from the general results for the PHR, the MLE for \(\theta \) is \({\hat{\theta }}_n=n/\log (1+x_n)\). This is a well-known result (see, e.g., Paul and Thomas 2016). Note that again it only depends on \(x_n\). Therefore, we just replace \(\theta \) with \({\hat{\theta }}_n\) in (25) to get the EMRC for the Pareto model as

$$\begin{aligned} {\hat{m}}_{n+1|n}(x_n)=-1+(1+x_n)0.5^{-1/{\hat{\theta }}_n}=-1+(1+x_n)0.5^{-\frac{1}{n}\log (1+x_n)}. \end{aligned}$$

It is plotted in Fig. 3, right, for the nine simulated records stated above. In that plot we also include the estimated regression curve \({\tilde{m}}_{n+1|n}\) (green) and the estimated centered confidence bands \([l_n,u_n]\) (\(50\%\)) and \([L_n,U_n]\) (\(90\%\)) (which depend on n) obtained by connecting the points in the intervals for \(n=1,\dots ,8\). These values are given in Table 4. In this sample, the empirical coverage proportions for \([l_n,u_n]\) and \([L_n,U_n]\) are 0.25 (2-out-of-8) and 0.5 (4-out-of-8), respectively (below the expected values). As in the exponential case, the joint MLP for \(R_{n+1}\) and \(\theta \) gives \({\hat{R}}_{n+1}=x_n\) and \({\hat{\theta }}= (n+1)/\log (1+x_n)\). The MSE for these eight predictions by using the EMRC, ERC and MLP are 89.88488, 141.3754 and 123.4542, respectively. Analogously, the MAE are 4.821381, 6.890559 and 6.161034. In this example, the EMRC provides the best predictions.
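A minimal Python sketch of these Pareto computations (the function names are ours), which can be checked against the numerical values reported in this subsection for \(n=9\):

```python
import math

def pareto_theta_mle(xn, n):
    # MLE of the Pareto (type II) parameter: theta_hat = n / log(1 + x_n)
    return n / math.log1p(xn)

def pareto_emrc(xn, n, q=0.5):
    # Estimated conditional q-quantile curve: -1 + (1 + x_n) q^{-1/theta_hat}
    return -1.0 + (1.0 + xn) * q ** (-1.0 / pareto_theta_mle(xn, n))
```

For \(x_9=49.69001\) this returns \({\hat{\theta }}_9\approx 2.29257\) and the EMRC prediction \({\hat{m}}_{10|9}\approx 67.585\).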

Table 4 Predicted values \({\hat{R}}_{n+1}={\hat{m}}_{n+1|n}(R_n)\) and \({\tilde{R}}_{n+1}={\tilde{m}}_{n+1|n}(R_n)\) and centered confidence intervals \([l_n,u_n]\) (\(50\%\)) and \([L_n,U_n]\) (\(90\%\)) for the first nine records from a Pareto distribution when \(\theta =2\) and it is unknown

The MLEs \({\hat{\theta }}_n=n/\log (1+R_n)\) of \(\theta =2\) obtained with the nine records are

$$\begin{aligned} 2.96111, 5.64834, 7.40914, 4.39704, 3.09228, 3.37795, 2.89233, 2.16473, 2.29257. \end{aligned}$$

Therefore, the prediction for the next upper record from the EMRC is

$$\begin{aligned} {\hat{m}}_{10|9}(R_9)=-1+(1+49.690004)0.5^{-1/2.292568} =67.58501. \end{aligned}$$

Analogously, the one obtained with the estimated regression curve is 88.90651 and the estimated centered confidence intervals are [56.46714, 91.79747] \((50\%)\) and \([50.83691, 186.2498]\,\, (90\%)\).

Fig. 3
figure 3

Plots of the paired records \((R_n,R_{n+1})\) from a Pareto distribution with \(\theta =2\), jointly with the median regression curve (black), the mean regression curve (green) and the \(50\%\) (blue–green) and \(90\%\) (red) exact confidence bands (left). The same is done in the right plot by assuming that \(\theta \) is unknown

Analogously, from (20), the median regression curve to predict \(R_{n+k}\) from \(R_n=x_n\) is

$$\begin{aligned} m_{n+k|n}(x_n)={\bar{F}}_\theta ^{-1}(c_k(0.5) {\bar{F}}_\theta (x_n))=-1+(c_k(0.5))^{-1/\theta } (1+x_n), \end{aligned}$$

where \(c_k\) is given in (17). The confidence bands and their estimated versions are obtained in a similar way by replacing \(\theta \) with the MLE \({\hat{\theta }}_n=n/{\log (1+R_n)}\).

3.5 Non-parametric predictions

If \({\bar{F}}\) is unknown and we know the sequence of original values \(X_1, \dots ,X_N\) which lead to the record values \(R_1,\dots ,R_n\) (\(n\le N\)), then the best option is to use the empirical (or kernel) estimator for F in (11) and (16). With the empirical estimator, the survival function \({\bar{F}}\) is approximated by

$$\begin{aligned} S(x)=\frac{1}{N}\sum _{i=1}^N 1(X_i>x) \end{aligned}$$

and its inverse with

$$\begin{aligned} S^{-1}(x)= X_{i:N} \end{aligned}$$

if \(N-i\le Nx<N-i+1\) for \(i=1,\dots ,N\), where \(X_{1:N},\dots ,X_{N:N}\) represent the ordered data. Therefore, the EMRC from \(X_1, \dots ,X_N\) and (11) is

$$\begin{aligned} {\hat{m}}_{n+1|n}(x)=S^{-1}(0.5S(x)). \end{aligned}$$

The estimated confidence regions and the estimated median regression curve \({\hat{m}}_{n+k|n}\) are obtained in a similar way.
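A Python sketch of these empirical estimators (the function names are ours; `np_emrc` computes \(S^{-1}(qS(x))\), with \(q=0.5\) for the median curve):

```python
import math

def emp_sf(data, x):
    # Empirical survival function S(x) = #{X_i > x} / N
    return sum(v > x for v in data) / len(data)

def emp_sf_inv(data, y):
    # S^{-1}(y) = X_{i:N} when N - i <= N y < N - i + 1
    xs = sorted(data)
    n = len(xs)
    i = min(max(n - math.floor(n * y), 1), n)
    return xs[i - 1]

def np_emrc(data, x, q=0.5):
    # Non-parametric EMRC: S^{-1}(q S(x))
    return emp_sf_inv(data, q * emp_sf(data, x))
```

The confidence bands are obtained by calling `np_emrc` with q equal to 0.25, 0.75, 0.05 and 0.95.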

For example, if we use the data which lead to the record values in Sect. 3.1 (uniform distribution), then we obtain an EMRC and EQCB very similar to those plotted in Fig. 1 (since \(N=20000\)). If we use the data in Sect. 3.3 (exponential distribution), then we obtain the EMRC and the EQCB plotted in Fig. 4, left (black and blue), where we also include the sequence of records (symbols ‘\(+\)’), the exact curves (red) and the EMRC obtained with the MLE of \(\theta \) for \(n=9\) (green) when we know that the model is an exponential distribution. In Fig. 4, right, we plot the respective curves from the Pareto sample in Sect. 3.4.

Fig. 4
figure 4

Plots of the paired records \((R_n,R_{n+1})\) from a standard exponential distribution (left) and a Pareto distribution (right), jointly with the non-parametric predictions of the median regression curves and the confidence bands (black and blue), the exact ones (red) and the EMRC curve with the MLE of \(\theta \) (green)

If \({\bar{F}}\) is unknown and we just know the sequence of record values \(R_1,\dots ,R_n\), then \({\bar{F}}\) should be estimated from the likelihood function given in (3). By using the inversion formula for the hazard rate function, it can be written as

$$\begin{aligned} \ell =h(x_1)\dots h(x_n)\exp \left( -\int _{-\infty }^{x_n}h(x)dx \right) . \end{aligned}$$

If

$$\begin{aligned} {\tilde{\ell }}=\log \ell = \sum _{i=1}^n \log h(x_i) -\int _{-\infty }^{x_n}h(x)dx , \end{aligned}$$
(26)

we have to maximize \({\tilde{\ell }}\) over the functions h in the set of all hazard rate functions of absolutely continuous distributions. This is a difficult task, so let us consider some possible approximate solutions.

If the left-end point of the support \(x_0=\inf \{x: F(x)>0\}\) is finite, then the integral in the expression of \({\tilde{\ell }}\) can be approximated by

$$\begin{aligned} \int _{x_0}^{x_n}h(x)dx \approx \sum _{i=1}^n (x_i-x_{i-1})h(x_i). \end{aligned}$$
(27)

This is equivalent to assuming that the hazard rate is piecewise constant, with a constant value that can change at the record values (a reasonable assumption in practice). With this approximation we have

$$\begin{aligned} {\tilde{\ell }}\approx \sum _{i=1}^n \left[ \log h(x_i) -(x_i-x_{i-1})h(x_i) \right] = \sum _{i=1}^n \phi _i(h(x_i)), \end{aligned}$$
(28)

where \(\phi _i(u)=\log u -(x_i-x_{i-1})u\), \(\phi ^\prime _i (u)=1/u -(x_i-x_{i-1})\) and \(\phi ^{\prime \prime }_i (u)=-1/u^2<0\). Therefore, the maximum value in (28) is obtained when

$$\begin{aligned} {\hat{h}}(x_i)=\frac{1}{x_i-x_{i-1}} \end{aligned}$$

for \(i=1,\dots ,n\). If we define \({\hat{h}}(x)\) as \({\hat{h}}(x_i)\) for \(x_{i-1}<x\le x_i\) and \(i=1,\dots ,n\), then \({\bar{F}}\) can be estimated with

$$\begin{aligned} S(x)&= \exp \left( -\int _{x_0}^x {\hat{h}}(z)dz \right) \nonumber \\&=\exp \left( 1-j-(x-x_{j-1}){\hat{h}}(x_j) \right) \nonumber \\&=\exp \left( 1-j-\frac{x-x_{j-1}}{x_j-x_{j-1}}\right) \end{aligned}$$
(29)

if \(x_{j-1}<x\le x_j\) (\(S(x)=1\) for \(x\le x_0\) and \(S(x)=0\) for \(x> x_n\)). Therefore its inverse is estimated with the inverse of S given by

$$\begin{aligned} S^{-1}(y)=x_{j-1}+(x_j-x_{j-1})(1-j-\log y) \end{aligned}$$

for \(e^{-j}\le y<e^{1-j}\) and \(j=1,\dots ,n\). If \(0<y<e^{1-n}\), it can be defined as \(S^{-1}(y)=x_n\). Then we can use the predictions in (11) and (16) to get approximations of the median regression curve and the associated confidence regions.
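The estimator (29) and its inverse can be coded directly from the record values. A Python sketch (the function names are ours), where `rec` holds \(x_1<\dots <x_n\) and `x0` the left-end point of the support:

```python
import math

def s_hat(x, rec, x0=0.0):
    # Piecewise-exponential survival estimator (29); rec = [x_1, ..., x_n]
    pts = [x0] + list(rec)
    if x <= x0:
        return 1.0
    if x > rec[-1]:
        return 0.0
    for j in range(1, len(pts)):
        if x <= pts[j]:
            return math.exp(1 - j - (x - pts[j - 1]) / (pts[j] - pts[j - 1]))

def s_hat_inv(y, rec, x0=0.0):
    # Inverse of (29): for exp(-j) <= y < exp(1 - j), j = 1, ..., n
    n = len(rec)
    if y < math.exp(-n):
        return rec[-1]
    pts = [x0] + list(rec)
    j = max(1, math.ceil(-math.log(y)))
    return pts[j - 1] + (pts[j] - pts[j - 1]) * (1 - j - math.log(y))
```

The approximate median regression curve is then `s_hat_inv(0.5 * s_hat(x, rec), rec)`, and the bands follow by changing the factor 0.5.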

We do that for the simulated record values obtained above from the exponential and Pareto models. Thus we get the estimated curves plotted in Fig. 5 (continuous lines). The dashed lines represent the exact curves (see Sects. 3.3 and 3.4). The green dashed lines represent the EMRC obtained with the MLE for \(\theta \). Note that the EMRC obtained from S fits the record values better. This is due to the bias induced by the fact that S is computed from these very record values (the same happens, e.g., in linear regression models). We cannot expect the same accuracy for new records. Actually, the estimated curves based on the MLE fit the exact curves better than the ones obtained from the non-parametric method (as expected).

Fig. 5
figure 5

Plots of the paired records \((R_n,R_{n+1})\) from a standard exponential distribution (left) and a Pareto distribution (right), jointly with the non-parametric predictions (continuous lines) obtained from (29) of the median regression curve (black) and the \(50\%\) and \(90\%\) confidence bands (blue, red). The dashed lines represent the respective exact curves and the green dashed lines the EMRC from the MLE of \(\theta \)

We can use other options to maximize \({\tilde{\ell }}\) in (26). For example, we could use a better approximation for the integral in that expression than the one used in (27).

Another popular option is to restrict the set of possible hazard functions. For example, we might assume only linear functions, that is, \(h(x)=ax+b\) for \(x\ge 0\) and \(a,b\ge 0\). This is equivalent to considering the MLE method applied to the Linear Hazard Rate (LHR) model with \({\bar{F}}(x)=\exp (-bx-ax^2/2)\) for \(x\ge 0\). This model contains the exponential model (\(a=0\)) and some Weibull models (\(b=0\)), and, when \(a>0\), its tail behaves like that of a normal (Gaussian) model as \(x\rightarrow \infty \). Thus, from (26), we need to maximize

$$\begin{aligned} {\tilde{\ell }}(a,b)= & {} \sum _{i=1}^n \log (ax_i+b) -\int _{0}^{x_n}(ax+b)dx \nonumber \\= & {} -\frac{1}{2}ax^2_n-bx_n+\sum _{i=1}^n \log (ax_i+b). \end{aligned}$$
(30)

The partial derivatives are

$$\begin{aligned} \partial _1 {\tilde{\ell }}(a,b)=-\frac{1}{2} x^2_n+\sum _{i=1}^n \frac{x_i}{ax_i+b} \end{aligned}$$
(31)

and

$$\begin{aligned} \partial _2 {\tilde{\ell }}(a,b)=- x_n+\sum _{i=1}^n \frac{1}{ax_i+b}. \end{aligned}$$
(32)

So we have to solve the equations

$$\begin{aligned} \sum _{i=1}^n \frac{x_i}{ax_i+b}=\frac{x^2_n}{2} \end{aligned}$$
(33)

and

$$\begin{aligned} \sum _{i=1}^n \frac{1}{ax_i+b}={x_n}. \end{aligned}$$
(34)

We can use numerical methods to solve these equations. For example, we could apply the bivariate Newton-Raphson (NR) method (see, e.g., Gil et al. 2007) by using (31) and (32). Thus, we choose an initial point \((a_0,b_0)\) (e.g. \(a_0=b_0=1\)) and then we compute the next points as

$$\begin{aligned} \left( a_{m+1},b_{m+1}\right) '=\left( a_{m},b_{m}\right) '-H^{-1}\left( a_m,b_m\right) V\left( a_m,b_m\right) \end{aligned}$$

for \(m=0,1,\dots \), where \(v'\) denotes the transpose of v,

$$\begin{aligned} V\left( a_m,b_m\right) =\left( \partial _1{\tilde{\ell }} \left( a_m,b_m\right) ,\partial _2{\tilde{\ell }}\left( a_m,b_m\right) \right) ' \end{aligned}$$

is the vector with the first partial derivatives, \(H(a_m,b_m)=(\partial _{i,j} {\tilde{\ell }}(a_m,b_m))\) is the Hessian matrix with the second partial derivatives and \(H^{-1}(a_m,b_m)\) is its inverse matrix.
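A Python sketch of this bivariate Newton-Raphson scheme (the function names are ours; a simple step-halving safeguard, not mentioned above, is added to keep \(ax_i+b>0\) along the iterations):

```python
def nr_lhr(xs, a=1.0, b=1.0, iters=100):
    # Bivariate Newton-Raphson for the LHR log-likelihood (30):
    # solves (33)-(34) for h(x) = a x + b given the records xs
    xn = xs[-1]
    for _ in range(iters):
        d = [a * x + b for x in xs]
        g1 = -0.5 * xn ** 2 + sum(x / di for x, di in zip(xs, d))   # (31)
        g2 = -xn + sum(1.0 / di for di in d)                        # (32)
        h11 = -sum((x / di) ** 2 for x, di in zip(xs, d))
        h12 = -sum(x / di ** 2 for x, di in zip(xs, d))
        h22 = -sum(1.0 / di ** 2 for di in d)
        det = h11 * h22 - h12 ** 2
        da = (h22 * g1 - h12 * g2) / det    # first coordinate of H^{-1} V
        db = (h11 * g2 - h12 * g1) / det    # second coordinate of H^{-1} V
        step = 1.0
        for _ in range(60):  # step-halving safeguard: keep a x_i + b > 0
            if all((a - step * da) * x + (b - step * db) > 0 for x in xs):
                break
            step *= 0.5
        a -= step * da
        b -= step * db
    return a, b
```

Since \({\tilde{\ell }}\) is strictly concave in (a, b), the Newton direction is an ascent direction and the iteration typically converges in a few steps.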

We apply this method to the first eleven records from a model with hazard rate \(h(x)=2x+3\). After six iterations (\(m=6\)) we obtain the MLEs \({\hat{a}}=0.8406102\) and \({\hat{b}}= 4.7184054\) of \(a=2\) and \(b=3\), with a maximum value of \({\tilde{\ell }}({\hat{a}},{\hat{b}})=7.947086\). These estimates are not very good. However, the EMRC and the EQCB obtained from them, plotted in Fig. 6 (right), are very similar to the exact ones (given in the left plot).

Fig. 6
figure 6

Plots of the paired records \((R_n,R_{n+1})\) from a model with hazard rate \(h(x)=2x+3\) jointly with the exact (left) and NR predictions (right) of the median regression curve (black) and the \(50\%\) and \(90\%\) confidence bands (blue, red)

4 A case with a real data set

We consider the data set studied in Awad and Raqab (2000), Sect. 5. It contains sizes of rocks that should be crushed. The data were published in Dunsmore (1983). The sizes are

$$\begin{aligned} 9.3, 0.6, 24.4, 18.1, 6.6, 9.0, 14.3, 6.6, 13, 2.4, 5.6, 33.88. \end{aligned}$$

Hence, the upper record values are \(R_1=9.3\), \(R_2=24.4\) and \(R_3=33.88\). The purpose is to predict \(R_4, R_5, \dots \) In Awad and Raqab (2000), it is assumed that the data come from an exponential distribution with an unknown parameter \(\theta \). Then, as we have seen before, the MLE for \(\theta \) is \({\hat{\theta }}_n=n/R_n\), obtaining the following estimates

$$\begin{aligned} {\hat{\theta }}_1=1/R_1=0.10752688,\ {\hat{\theta }}_2=2/R_2=0.08196721,\ {\hat{\theta }}_3=3/R_3= 0.08854782. \end{aligned}$$

The stability of these estimates supports the assumption that the exponential model is correct. The estimate from the complete sample is \({\hat{\theta }}=0.08350731\). Then, to predict \(R_4\) we can use the curves given in (21) and (22) with the estimate \({\hat{\theta }}_3=3/R_3= 0.08854782\). They are plotted in Fig. 7 (left, black and purple lines) jointly with the corresponding \(50\%\) (blue) and \(90\%\) (red) confidence bands for these predictions. We also plot the paired record sequence \((R_1,R_2)\) and \((R_2,R_3)\). As we can see, both points belong to the \(50\%\) confidence bands. Therefore, the exponential model cannot be rejected with these data. The respective predictions for \(R_4\) are 41.70794 (black) and 45.17333 (purple) and the confidence intervals are [37.12889, 49.53588] (\(50\%\)) and [34.45927, 67.7118] (\(90\%\)).
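The numbers in this paragraph can be reproduced with a few lines of Python (the script below uses the value 33.88 on which the reported results are based):

```python
import math

rocks = [9.3, 0.6, 24.4, 18.1, 6.6, 9.0, 14.3, 6.6, 13, 2.4, 5.6, 33.88]

# Extract the upper record values
recs = []
for v in rocks:
    if not recs or v > recs[-1]:
        recs.append(v)
# recs == [9.3, 24.4, 33.88]

n, xn = len(recs), recs[-1]
theta_hat = n / xn                              # MLE under the exponential model
r4_median = xn - math.log(0.5) / theta_hat      # EMRC prediction (21) for R_4
r4_mean = xn + xn / n                           # ERC prediction (22) for R_4
band50 = (xn - math.log(0.75) / theta_hat, xn - math.log(0.25) / theta_hat)
band90 = (xn - math.log(0.95) / theta_hat, xn - math.log(0.05) / theta_hat)
```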

Fig. 7
figure 7

Plots of the paired records \((R_n,R_{n+k})\) for \(k=1\) (left) and \(k=2\) (right) from the real data set in Dunsmore (1983) jointly with the predictions of the median and mean regression curves (black and purple) and the \(50\%\) and \(90\%\) confidence bands (blue, red) under an exponential model

If we want to predict \(R_5\) from \(R_3\), we can use the curve obtained from (23)

$$\begin{aligned} {\hat{m}}_{n+2|n}(x_n)=x_n-\frac{1}{{\hat{\theta }}} \log (c_2(0.5)) = x_n+1.678347\frac{1}{{\hat{\theta }}}. \end{aligned}$$

In our case, with \({\hat{\theta }}_3\), we get \({\hat{m}}_{5|3}(x_3)=x_3+18.95413\) (see Fig. 7, right, black line), which gives the prediction \({\hat{R}}_5= {\hat{m}}_{5|3}(33.88)=52.83413\). In this figure we also provide the estimated centered confidence bands [44.73604, 64.28882] (\(50\%\)) and [37.89322, 87.45404] (\(90\%\)) for \(R_5\). The only paired data point available, \((R_1,R_3)\), is also plotted. Note that it belongs to the \(50\%\) confidence region.

5 A case study in reliability

In this section we show a simulated case study in reliability. We assume that, in a production process, every day a randomly chosen unit is put into operation (under a normal or an accelerated test procedure). Let \(X_1, X_2,\dots \) be the IID lifetimes of these units (think, e.g., of lifetimes of batteries in mobile phones or cars). After c days, only some of these values are available in complete (uncensored) form. For example, after \(c=3\) days, \(X_1\) is known if and only if \(X_1<3\). In general, after c days, \(X_j\) is available if and only if \(X_j<c-j+1\) (or \(X_j+j-1<c\)). In other words, \(X_j\) is known on the \(k_j\)-th day, with \(k_j=[ X_j+j]\), where [x] represents the integer part of x. The procedure described below, based on these assumptions, can be adapted to other (similar) sampling schemes. The key idea is to use lower record values because they will be available earlier and with a small proportion of censored data.

We want to use this process to study early failures. Therefore, we can assume that the units do not age and that their lifetimes follow an exponential distribution F with an unknown mean \(\mu >0\) and constant hazard rate \(h(t)=\theta =1/\mu \) for \(t\ge 0\). The purpose is to estimate \(\theta \) (or \(\mu \)) and the lower record values. Here it is important to discriminate between small values from F and outliers (i.e. small values not obtained from F). To illustrate this process, we simulate 100 data points. The 36 lifetimes available after \(c=75\) days are given in Table 5 (in order of availability). The other data are censored. The available lower record values in this sequence are marked in bold.

Table 5 Available lifetimes \(X_I\) after 75 days in order of availability. The variable I represents the index in the original sample

The incomplete (censored) sequence of lower records obtained from this table is \(R_1^*=75^+\), \(R_2^*=21.59455\), \(R_3^*=3.848409\) and \(R_4^*=2.367513\). The complete sequence of lower records from these 100 data points is 89.21383, 21.59455,  3.848409, 2.367513. To get the exact value of \(X_1\) we have to wait until the 90-th day. However, our censored sequence at day 75 is a good approximation of the exact sequence of lower records. To get the complete sample we have to wait until the 467-th day, when \(X_{15}=452.5437\) will become available.

Instead, we only use the information available on the 75-th day (given in Table 5). The likelihood function for the lower record values \((R_1^*,\dots ,R^*_n)\) in the exponential model is

$$\begin{aligned} \ell (\theta )={\bar{h}}(x_1)\dots {\bar{h}}(x_n)F(x_n)=\theta ^n e^{-\theta x_n}\prod _{j=1}^{n-1} \frac{e^{-\theta x_j}}{1-e^{-\theta x_j}}, \end{aligned}$$

where \({\bar{h}}=f/F\) is the reversed hazard rate. As a consequence, the joint distribution of \((R_1^*,\dots ,R^*_n)\) can also be written as a distorted distribution

$$\begin{aligned} {\mathbf {G}}\left( x_1,\dots ,x_n\right) =D\left( F(x_1),\dots ,F(x_n)\right) \end{aligned}$$

for \(x_1>\dots >x_n\), where the PDF d of D is equal to the one in (6). Actually, D coincides with the distortion function \({\hat{D}}\) in (4).

To estimate \(\theta \) from our censored sample with the MLE, we maximize \(\ell \), obtaining \({\hat{\theta }}=0.01645\) (or \({\hat{\mu }}=60.77\)). If we wait until the 90-th day to get the complete lower record sample, then the MLE is \({\hat{\theta }}=0.01437\) (or \({\hat{\mu }}=69.55\)).

The conditional distribution of \((R^*_{n+1}|R_1^*=x_1,\dots ,R_n^*=x_n)\) is

$$\begin{aligned} F\left( x|x_1,\dots ,x_n\right) =\frac{F(x)}{F(x_n)} \end{aligned}$$
(35)

for \(x\le x_n\). Therefore, the median regression curve to predict \(R^*_{n+1}\) from \(R_n^*\) is

$$\begin{aligned} m^*_{n+1|n}(x_n)=F^{-1}\left( 0.5F(x_n)\right) \end{aligned}$$

and in the case of an exponential distribution we obtain

$$\begin{aligned} m^*_{n+1|n}(x_n)=-\frac{1}{\theta }\ln \left( 0.5+0.5\exp \left( -\theta x_n\right) \right) . \end{aligned}$$

It is plotted in Fig. 8 by replacing \(\theta \) with the estimates obtained above from the censored (left) and uncensored (right) samples. We also include in the plots the sequences of paired lower records and the centered \(50\%\) confidence band \([F^{-1}(0.25F(x_n)),F^{-1}(0.75F(x_n))]\) and \(90\%\) confidence band \([F^{-1}(0.05F(x_n)),F^{-1}(0.95F(x_n))]\). The respective predictions for the next lower record \(R_5^*\) are 1.1722 and 1.1736 (they are also included in the plots). We also plot the exact curves (dashed lines) and thus we can see that the predictions are good (in both cases).

Fig. 8
figure 8

Plots of the paired lower records \((R^*_n,R^*_{n+1})\) for the censored (left) and uncensored (right) samples from the data in Table 5 jointly with the predictions of the median and mean regression curves (black and purple) and the \(50\%\) and \(90\%\) confidence bands (blue, red) under an exponential model. The dashed lines represent the exact curves (obtained with \(\theta =0.01\))

To predict the next lower record we can also use the mean regression curve obtained from (35) as

$$\begin{aligned} {\tilde{m}}^*_{n+1|n}(x_n)=E(R^*_{n+1}|R^*_n=x_n)=\frac{1}{\theta }-\frac{x_n\exp (-\theta x_n)}{1-\exp (-\theta x_n)} \end{aligned}$$

for \(x_n>0\). It is also plotted in Fig. 8 (purple lines) for the different estimates of \(\theta \). The predictions from this curve for \(R_5^*\) are 1.176064 (censored sample, day 75) and 1.177034 (complete record sample, day 90).
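Both curves for the exponential lower records are easy to evaluate. A Python sketch (the function names are ours), which reproduces the predictions for \(R_5^*\) when evaluated at \(x_4=2.367513\) with the estimates of \(\theta \) given above:

```python
import math

def lower_median_curve(xn, theta):
    # Median regression curve for the next lower record (exponential model):
    # -(1/theta) log(0.5 + 0.5 exp(-theta x_n))
    return -math.log(0.5 + 0.5 * math.exp(-theta * xn)) / theta

def lower_mean_curve(xn, theta):
    # Mean regression curve E(R*_{n+1} | R*_n = x_n) from (35)
    e = math.exp(-theta * xn)
    return 1.0 / theta - xn * e / (1.0 - e)
```

The confidence bands follow by replacing the factor 0.5 in the median curve with 0.25, 0.75, 0.05 and 0.95.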

The available data in Table 5 can also be used to get other lower record sequences. For example, starting with \(R_1^*=X_4=69.04814\), we get the sequence \(R_2^*=X_7=45.24274\), \(R_3^*=X_8=3.848409\) and \(R_4^*=X_{23}=2.367513\). With this sequence, the MLE for \(\theta \) gives \({\hat{\theta }}=0.0144\) (or \({\hat{\mu }}=69.515\)). These estimates can be used to improve the estimated curves plotted above.

6 Conclusions

We have proposed a very general method to predict record values from preceding records. We focused on upper records, but the results for lower records are completely analogous (see Sect. 5). These predictions are based on conditional quantiles, which can also be used to provide confidence bands. We first obtained the theoretical expressions and then showed how to apply them in practice in three different cases: a given probability (distribution) model, a given parametric model with an unknown parameter, and a non-parametric method for arbitrary models. All these cases show that the proposed method is a good way to manage the uncertainty in the prediction of future record values. The confidence bands are more appropriate for this purpose than point estimators.

This paper is just a first step based on the concept of multivariate distorted distributions introduced in Navarro et al. (2020). The present study for records should be completed by considering MLEs for other relevant models (here we only studied the uniform, PHR, exponential and Pareto models). Other estimation procedures can be considered as well (see, e.g., Volovskiy and Kamps 2020a, and the references therein). The non-parametric estimations could be improved by using kernel estimators (which provide continuous curves). Moreover, similar techniques can be used to predict other ordered values such as order statistics (see Navarro et al. 2020), sequential order statistics, generalized order statistics, time series and sequences of non-identically distributed random variables. These topics are left for future research projects.