Prediction of record values by using quantile regression curves and distortion functions

The purpose of the paper is to provide a general method based on conditional quantile curves to predict record values from preceding records. The predictions are based on conditional median (or median regression) curves. Moreover, conditional quantiles curves are used to provide confidence bands for these predictions. The method is based on the recently introduced concept of multivariate distorted distributions that are used instead of copulas to represent the dependence structure. This concept allows us to compute the conditional quantile curves in a simple way. The theoretical findings are illustrated with a non-parametric model (standard uniform), two parametric models (exponential and Pareto), and a non-parametric procedure for the general case. A real data set and a simulated case study in reliability are analysed.


Introduction
The prediction of future record values is a relevant topic nowadays. Several papers had studied and applied this topic to different problems (climate change, sports, etc.). In several practical situations, the usual point predictions obtained with linear regression do not provide good record predictions. The basic tools in this field can be seen in Dunsmore (1983), Raqab and Nagaraja (1995), Awad and Raqab (2000) and Nevzorov (2001) and the recent advances in Paul and Thomas (2016), Guo et al. (2020), Volovskiy and Kamps (2020a, b). Specific results for records from exponential distributions can be found in Awad and Raqab (2000), Basak and Balakrishnan (2003) Raqab (2007), Volovskiy and Kamps (2020a, b) and the references therein. Analogously, the prediction of record values from a Pareto model were studied in Raqab et al. (2007), Volovskiy (2018) and Volovskiy and Kamps (2020b). Moreover, the prediction tools for generalized order statistics can also be applied to record values (see, Burkschat 2009).
The conditional median (or quantile) regression curves are a good alternative to the classical (mean) regression tool. The theoretical quantile regression curve to predict a response variable Y from an explanatory variable X can be obtained from the copula of (X , Y ) (see, e.g., p. 217 in, Nelsen 2006). In practice, we can estimate the unknown parameters (in the copula or in the marginal distributions) or to use the empirical linear or non-linear estimators proposed in Koenker and Bassett (1978) and Koenker (2005). The package quantreg for the statistical program R can help us in this purpose. Moreover, the quantile regression curves can also be used to obtain confidence bands for the predictions and bivariate box plots, see Koenker and Bassett (1978) and Navarro (2020).
The aim of the paper is to use conditional median (or quantile) regression curves to predict future record values. Instead of use the general tools, we develop a specific procedure for upper record values. This procedure is based on the concept of multivariate distorted distributions introduced recently in Navarro et al. (2020). It is an alternative method to the classic copula approach to represent the dependence structure. A similar method can be used for lower record values (see Sect. 5). This is a novel approach that simplify the expressions for the quantile curves.
The rest of the paper is scheduled as follows. The main results are given in Sect. 2. These general results are applied to three models in Sect. 3. A fixed model. We consider a uniform distribution but a similar method holds for other fixed distribution functions. A parametric model, the popular proportional hazard rate (PHR) model with applications to exponential and Pareto models. And finally, we also consider a nonparametric approach. A (classical) real data set and a simulated data set of lifetimes in a specific sampling procedure in reliability are studied in Sects. 4 and 5, respectively. The conclusions are placed in Sect. 6.
If f is a real valued function with n variables, then ∂ i f denotes the partial derivative of f with respect to its i-th variable. Analogously, ∂ i, j f = ∂ i ∂ j f and so on. Whenever we use these partial derivatives, we are tacitly assuming that they exist.

Main results
Let (X n ) n be an infinite sequence of independent and identically distributed (IID) random variables with an absolutely continuous cumulative distribution function F, survival (reliability) functionF = 1 − F and probability density function (PDF) f = F .
An observation X j is called an upper record value in this sequence if it is greater than all previously observed values X 1 , . . . , X j−1 . More specifically, the upper record times are defined as T (1) = 1 and T (n + 1) = min{ j : X j > X T (n) } for n = 1, 2, . . . . Then the sequence (R n ) n of upper record values is defined by R n = X T (n) for n = 1, 2, . . . . The lower record values (R * n ) n are defined in a similar way. The purpose of the paper is to predict R s from R 1 , . . . , R n (or just from a single R i with i ≤ n) for s > n.
It is well known (see, e.g., Nevzorov 2001, p. 65) that the survival functionḠ n (t) = Pr(R n > t) of the n-th upper record value is given bȳ whereq for u ∈ [0, 1] and n = 1, 2, . . . . The functionq n : [0, 1] → [0, 1] is continuous, increasing and satisfiesq n (0) = 0 andq n (1) = 1. It is called dual distortion function, see Hürlimann (2004). A similar representation holds for the distribution function G n of R n that can be written as The function q n is called the distortion function and G n is a distorted distribution from F. The distorted distributions were introduced in Yaari (1987) in the context of the theory of choice under risk. Some applications and ordering results can be seen in Wang (1996) and Navarro et al. (2013). Then, the PDF g n of R n can be obtained as (see, e.g., Awad and Raqab 2000). Even more, the joint PDF g of (R 1 , . . . , R n ) is given by for Nevzorov 2001, p. 68), where h = f /F is the hazard (or failure) rate function associated to F. Hence, from (3), we obtain the following result.
A similar representation can be obtained for the joint distribution function G, that is, for a continuous increasing function D : [0, 1] n → [0, 1] which does not depend on F. Note that D can be computed fromD (and vice versa). We must say that the explicit expression ofD (or D) is complicated. However, the expression forĈ is much more complicated (some examples are given below).
These representations are similar to the classic copula representations (see, e.g., Nelsen 2006) but here D andD are not necessarily copulas and F is not the marginal distribution of R i for i = 2, . . . , n. Note that, as in (1), the joint survivalḠ is written in (4) as a distortion of the univariate survival functionF at different points.
This kind of representations were called multivariate distorted distributions in Navarro et al. (2020). It can be proved that D andD can be extended to R n and then they are multivariate distribution functions with supports included in [0, 1] n . Analogously, the joint PDF of (R 1 , . . . , R n ) can be written as follows.
Proof The expression of the PDF g in (5) can be obtained by differentiating in (4). Then the explicit expression ford can be obtained from (3) and (5).
Therefore, the explicit expression forD can be obtained from the expression for d given in (6). Moreover, the different marginal distributions of (R 1 , . . . , R n ) have also multivariate distorted distributions (see, Navarro et al. 2020). For example, if 1 ≤ i < j ≤ n, then the joint survival functionḠ i, j of (R i , R j ) can be written from (4) asḠ whereD i, j (u, v) =D(1, . . . , 1, u, 1, . . . , 1, v, 1, . . . , 1) and u and v are placed at the i-th and j-th variables, respectively. Hence, to predict R j from R i for i < j, we can use here a similar technique to that developed in copula theory to compute the median regression curve. The result can be stated as follows.

Proposition 3 The conditional survival functionḠ j|i of
Proof From (7), the joint PDF of (R i , R j ) is Therefore, from (2), the conditional PDF of (R j |R i = x i ) is where, in the last equality, we use that lim v→0 + ∂ 1Di, j (u, v) = 0 for all 0 < u < 1 and that 0 <F(x i ) < 1.
As a consequence, the median regression curve to predict R j from R i is whereḠ −1 j|i is the inverse function ofḠ j|i . This procedure can also be used to determine quantile confidence bands for this prediction. For example the centered 50% quantile confidence band is given by and the centered 90% quantile confidence band by To get these predictions from (8) we need to computeD (orD i, j ) and to estimatē F. In the following subsections, we consider different relevant cases.

Case i = 1 and j = 2
A straightforward calculation from (3) proves that the joint survival functionḠ 1,2 of (R 1 , R 2 ) can be written as for , that is, it is the dual distortion function of the second upper record given in (1). Also note thatF is equal to the first marginal survival distribution but it is not equal to the second one (so (10) is not a copula representation).
To get the copula (or the survival copula) representation of (R 1 , R 2 ) we need the explicit expression of the inverse function of G 2 . The distortion functionD 1,2 can also be obtained from Proposition 2 aŝ for x 2 ≥ x 1 and x 1 such that 0 <F(x 1 ) < 1 and f (x 1 ) > 0. This a very well known result (see, e.g., Nevzorov 2001, p. 68). Then, the median regression curve to predict whereF −1 is the inverse function ofF. Analogously, the centered 50% and 90% quantile confidence bands for this prediction are given by and They are plotted in Sect. 3 for different survival functionsF.

Case i = n and j = n + 1
First, let us see how the joint survival functionḠ n,n+1 of (R n , R n+1 ) can be written as a distortion from the univariate survival functionF.

Proposition 4
The joint survival functionḠ n,n+1 of (R n , R n+1 ) can be written as is the survival function of a gamma distribution with scale parameter equal to one and shape parameter equal to n + 1.
Therefore, from (12), we get for x n+1 ≥ x n and x n such thatF(x n ) > 0 and f (x n ) > 0. This expression is also a very well known result (see, e.g., Nevzorov 2001, p. 68). Even more, we can see that the record values form a Markov chain, that is, Therefore the median regression curve to predict R n+1 from R n (or from R 1 , . . . , R n ) is the same as that obtained in the preceding subsection (see (11)) and so are the confidence regions. This is a good point since, if F is known, then the median regression curve and the confidence regions do not depend on n and the sequence of paired records (R n , R n+1 ) for n = 1, 2, . . . can be plotted in the same graph (since they have the same conditional distribution). This is not the case if we need to estimate F or a parameter in F (see the different examples in Sect. 3).

Case i = m and j = n
In the general case the joint survival functionḠ m,n of (R m , R n ) for 1 ≤ m < n can also be written as a distortion fromF. The result is stated in the following proposition.

Proposition 5
The joint survival functionḠ m,n of (R m , R n ) for 1 ≤ m < n can be written asḠ for 1 > u ≥ v > 0 andγ k is the survival function in (13).
Proof From (3) (or from (3.2) in, Awad and Raqab 2000) the joint PDF of (R m , R n ) is for x m ≤ x n as stated above.
Therefore, from (14), we obtain for x n ≥ x m since lim v→0 + ∂ 1Dm,n (u, v) = 0 for all 0 < u < 1. Note that (15) proves thatḠ n|m is also a distorted distribution sincē Therefore, the median regression curve to predict R n from R m = x m is given by whereF −1 is the inverse function ofF, and γ −1 k is the quantile function of a gamma distribution with shape parameter k and scale parameter equal to one.
Analogously, the 50% and 90% quantile confidence bands for this prediction are given by Note that they only depend onF and on k = n − m. Hence, ifF is known, then we have common confidence bands for the sequence of paired records (R 1 , R 1+k ), (R 2 , R 2+k ), . . . They are plotted in Sect. 3 for different survival func-tionsF. This is not the case if we estimateF (or a parameter inF). Of course, note that these general formulas can be used to obtain the particular formulas included in the preceding subsections.

Examples
We consider here some classical models to show how to proceed when F is known or when F θ is known but it contains an unknown parameter θ . We also study the case in which F is completely unknown.

Uniform distribution
As we have mentioned before, if F is known, then the predictions and the confidence bands do not depend on n. Hence we can plot the paired records, the predictions for new records and the confidence bands in the same figure. Let us see some examples.
The simplest case is a standard uniform distribution with F(x) = x for 0 ≤ x ≤ 1. Hence, from (11), the median regression curve to predict R n+1 from R n = x n (or from  Fig. 1 Plots of the paired records (R n , R n+1 ) (left) and (R n , R n+2 ) (right) from a standard uniform distribution jointly with the median regression curve (black) and the limit for the 50% (blue) and 90% (red) confidence bands The classical mean regression curve can be obtained as that is, in this case, both curves coincide. Analogously, the 50% and 90% confidence bands for the predictions are They are plotted in Fig. 1 (left) jointly with m n+1|n and the sequence of paired records (R 1 , R 2 ), . . . , (R 8 , R 9  The predicted values m n+1|n (R n ) and the confidence intervals [l n , u n ] (50%) and [L n , U n ] (90%) for these records are given in Table 1. Note that 5-out-of-8 points belong to the 50% confidence interval (a little bit above the expected coverage) and just 7-out-of-8 belong to the 90% confidence interval (a little bit below the expected coverage). We can compare the prediction for R 9 with that obtained from the (sample) regression line m(x) = 0.6688 + 0.3097x based on the pairs (R 1 , R 2 ), . . . , (R 7 , R 8 ). This prediction is m(R 8 ) = 0.6688 + 0.3097 · 0.999815 = 0.9784427. The prediction in Table 1 is m n+1|n (R 8 ) = 0.9999075 (median regression curve) and the exact value is R 9 = 0.999959. The respective absolute errors are 0.02151629 (estimated regression line) and 0.0000515 (exact median regression curve). As expected the second prediction is better (since it takes into account the information about the model). Analogously, from (16), the median regression curve to predict R n+k from R n = x n (or from (17)). The values of c k (0.5) for k = 1, 2, 3, 4, 5 are 0.5, 0.186682309, 0.068971610, 0.025424023, 0.009363755.
The code in the statistical program R to get c k (u) is: exp(-qgamma(u,k)).
In a similar way, the 50% and 90% confidence bands for the predictions are They are plotted in Fig. 1 (right) for k = 2 jointly with m n+2|n and the sequence of the first seventh paired records (R 1 , R 3 ), . . . , (R 7 , R 9 ). The prediction obtained for the nineth record from the seventh one iŝ The same method can be applied to other (known) distributions. Some examples are given in the next subsections. Note that the quantile functions of the most usual models are included in statistical programs as R.

PHR model
If the baseline distribution function F θ has a known parametric form with an unknown parameter θ and R 1 = x 1 , . . . , R n = x n are known, then we can use (3) to get the likelihood function for θ as This expression can be used to obtain the maximum likelihood estimator (MLE)θ of θ . Numerical methods can be used to get this estimator.
Then, as in the preceding subsection, we can useFθ to get the estimated median regression curve and the estimated confidence bands replacingF withFθ in the theoretical results obtained in Sect. 2.
In this subsection we consider a very popular model in reliability and survival studies, the proportional hazard rate (PHR) Cox model. This model contains useful models such as exponential and Pareto distributions (studied in the following subsections). It is defined by a proportional hazard rate function h θ (x) = θ h(x), where h is a known hazard rate function and θ > 0 is an unknown parameter. Hence, its survival function isF θ (x) =F θ (x), whereF is the survival function associated to h. Hence, the likelihood function is To maximize is equivalent to maximizẽ Then the MLE of θ isθ Therefore, from (11), the exact median regression curve for the PHR model is sinceF −1 θ (y) =F −1 (y 1/θ ) is the inverse function ofF θ . If we replace θ with the MLÊ θ n given in (19), we obtain the maximum likelihood estimated median regression curve (EMRC) aŝ Analogously, the 50% and 90% estimated quantile confidence bands (EQCB) are and respectively.
In the general case, if we want to predict R n+k from R n for k > 0, from (16), the EMRC is where c k is defined in (17). The EQCB are obtained in a similar way by replacing 0.5 with 0.05, 0.25, 0.75, 0.95.

Exponential distribution
The exponential model with survival functionF θ (x) = exp(−θ x) for x ≥ 0 satisfies the PHR model with h(x) = 1 andF(x) = exp(−x). Then, from the results given in the preceding subsection, the MLE for θ isθ n = n/x n . This is a well known result (see, e.g., Awad and Raqab 2000). If θ is known, the exact median regression curve to predict R n+1 from R n = x n (or from If θ is unknown, we replace θ with the MLEθ n , obtaining the maximum likelihood EMRC asm The maximum likelihood estimated regression curve (ERC)m n+1|n (x n ) = Eθ (R n+1 |R n = x n ) can be obtained as that is, in this case, both curves (lines) do not coincide. Another option is to use the joint MLP for (θ, R n+1 ) from R 1 = x 1 , . . . , R n = x n . It is well known that this procedure leads toθ = (n + 1)/x n and the trivial estimator R n+1 = x n .
As for the exponential distribution both curvesm n+1|n (x n ) andm n+1|n (x n ) are straight lines, we could also use a (sample) linear mean (or quantile) regression to predict R n+1 from a sequence of paired records (as in Sect. 3.1).
From the general results for the PHR model, the 50% and 90% estimated quantile confidence bands (EQCB) for the exponential distribution are respectively. They were obtained previously in Awad and Raqab (2000). Note that they depend on n.
If θ is known, the exact confidence bands are obtained just by replacingθ n with θ . In this case, they do not depend on n and we can plot together the regression curves, the confidence bands and the sequence of paired records. We do so in Fig. 2, left, for θ = 1 including the exact regression curves and the sequence of the first eight paired records (R 1 , R 2 ), . . . , (R 8 , R 9 ). The simulated sequence of records R 1 , . . . , R 9 obtained is 0.45701, 1.19403, 1.74177, 2.00398, 2.25833, 4.97619, 5.84512, 6.90868, 14.16568.
The purple line represents the sample regression line m(x) = −0.2967 + 1.6335x obtained from (R 1 , R 2 ), . . . , (R 8 , R 9 ) which a very bad approximation of both theoretical regression lines (it is strongly determined by the data and it does not take into account the information about the model).
If θ is unknown, the estimated confidence bands [l n , u n ] (50%) and [L n , U n ] (90%) depend on n. They are plotted (blue, red) in Fig. 2, right, for θ = 1 by connecting the points obtained for n = 1, . . . , 8. We also include the estimated regression curveŝ m n+1|n (black) andm n+1|n (green) and the simulated sequence of paired records. These values are given in Table 2. In this sample, the real coverage probabilities for [l n , u n ] and [L n , U n ] are 0.625 (5-out-of-8) and 0.75 (6-out-of-8), respectively. The  Fig. 2 Plot of the paired records (R n , R n+1 ) from a standard exponential distribution jointly with the median regression curve (black), the theoretical and sample regression curves (green, purple) and the limits for the 50% (blue) and 90% (red) exact confidence bands (left). The same is done in the right plot by assuming that θ is unknown MLE predictions fromθ n = n/R n of θ = 1 for n = 1, . They are used both inm n+1|n (R n ) andm n+1|n (R n ). The mean squared errors (MSE) for these eight predictions are 12.39879 and 13.70644, respectively. Analogously, the mean absolute errors (MAE) are 2.684359 and 2.877943. However, in this case, the joint maximum likelihood predictor (MLP)R n+1 = R n , provides the best predictions with respective errors 9.747176 and 2.320351. Note that the MLP are always out of the centered confidence bands. If we use this joint MLP, they should be replaced with the bottom confidence bands x n , x n − log(0.5) 1 θ = x n , x n − log(0.5) x n n (50%) and x n , x n − log(0.1) 1 θ = x n , x n − log(0.1) x n n (90%), where x n = R n . Analogously, from (16), the exact median regression curve to predict R n+k from R n = x n (or from R 1 = x 1 , . . . , R n = x n ) is where c k is given in (17) (some of these values were given in (18)). The confidence bands and their estimated versions can be obtained in a similar way. For example, the Remark 1 Let us compare the different predictors for R s from R n in this model. The best linear unbiased predictor (BLUP) isR (1) s = s n R n , the best linear invariant predictor (BLIP) isR (2) s = s+1 n+1 R n , and the maximum likelihood predictor (MLP) iŝ R (3) s = s n+1 R n . They were compared in Awad and Raqab (2000) as well. The predictor obtained from Volovskiy and Kamps (2020a) for this model isR   Table 3. In this case (and with these error measures), the best predictions are obtained with the BLIP (i = 2). For more comparisons see Raqab and Nagaraja (1995) and Volovskiy and Kamps (2020a, b).
If θ is unknown, from the general results for the PHR, the MLE for θ isθ n = n/ log(1 + x n ). This is a well known result (see, e.g., Paul and Thomas 2016). Note that again it only depends on x n . Therefore, we just replace θ withθ n in (25) to get the EMRC for the Pareto model aŝ It is plotted in Fig. 3, right, for the nine simulated records stated above. In that plot we also include the estimated regression curvesm n+1|n (green) and the estimated centered confidence bands [l n , u n ] (50%) and [L n , U n ] (90%) (that depend on n) obtained by connecting the points in the intervals for n = 1, . . . , 8. These values are given in Table 4. In this sample, the real coverage probabilities for [l n , u n ] and [L n , U n ] are 0.25 (2-out-of-8) and 0.5 (4-out-of-8), respectively (below the expected values). As in the exponential case, the joint MLP for R n+1 and θ givesR n+1 = x n and θ = (n + 1)/ log(1 + x n  (20), the median regression curve to predict R n+k from R n = x n is where c k is given in (17). The confidence bands and their estimated versions are obtained in a similar way by replacing θ with the MLEθ n = n/log(1 + R n ).

Non-parametric predictions
IfF is unknown and we know the sequence of original values X 1 , . . . , X N which lead to the record values R 1 , . . . , R n (n ≤ N ), then the best option is to use the empirical   Fig. 3 Plots of the paired records (R n , R n+1 ) from a Pareto distribution with θ = 2, jointly with the median regression curve (black), the mean regression curve (green) and the 50% (blue-green) and 90% (red) exact confidence bands (left). The same is done in the right plot by assuming that θ is unknown (or kernel) estimator for F in (11) and (16). With the empirical estimator, the survival functionF is approximated with and its inverse with . . , N , where X 1:N , . . . , X N :N represent the ordered data. Therefore, the EMRC from X 1 , . . . , X N and (11) iŝ m n+1|n (x) = S −1 (0.5S(x)).
The estimated confidence regions and the estimated median regression curvem n+k|n are obtained in a similar way. For example, if we use the data which lead to the record values in Sect. 3.1 (uniform distribution), then we obtain EMRC and EQCB very similar to that plotted in Fig. 1  (since N = 20000). If we use the data in Sect. 3.3 (exponential distribution), then we obtain the EMRC and the EQCB plotted in Fig. 4, left (black and blue), where we also include the sequence of records (symbols '+'), the exact curves (red) and the EMRC obtained with the MLE of θ with n = 9 (green) when we know that the model is an exponential distribution. In Fig. 4, right, we plot the respective curves from the Pareto sample in Sect. 3.4.
IfF is unknown and we just know the sequence of record values R 1 , . . . , R n , then F should be estimated from the likelihood function given in (3). By using the inversion Fig. 4 Plots of the paired records (R n , R n+1 ) from a standard exponential distribution (left) and a Pareto distribution (right), jointly with the non-parametric predictions of the median regression curves and the confidence bands (black and blue), the exact ones (red) and the EMRC curve with the MLE of θ (green) formula for the hazard rate function, it can be written as we have to maximize˜ for the functions h in the set of all the hazard rate functions of absolutely continuous distributions. This is a difficult task. Let us see some possible approximate solutions.
If the left-end point of the support x 0 = inf{x : F(x) > 0} is finite, then the integral in the expression of L can be approximated by This is equivalent to assume that the hazard rate is constant but that this constant can change at the record values (a reasonable assumption in practice). With this approximation we havẽ and φ i (u) = −1/u 2 < 0. Therefore, the maximum value in (28) is obtained when . Therefore its inverse is estimated with the inverse of S given by for e − j ≤ y < e 1− j and j = 1, . . . , n. If 0 < y < e 1−n , it can be defined as S −1 (y) = x n . Then we can use the predictions in (11) and (16) to get approximations of the median regression curve and the associated confidence regions. We do that for the simulated record values obtained above from the exponential and Pareto models. Thus we get the estimated curves plotted in Fig. 5 (continuous lines). The dashed lines represent the exact curves (see Sects. 3.3 and 3.4). The green dashed lines represent the EMRC obtained with the MLE for θ . Note that the EMRC obtained from S fit better to the record values. This is due to the bias induced by the fact that S is computed from these record values (the same happen, e.g., in the linear regression models). We cannot expect the same accuracy for new records. Actually, the estimated curves based on the MLE fit better to the exact curves than the ones obtained from the non-parametric method (as expected).
We can use other options to maximize˜ in (26). For example, we could use a better approximations for the integral in that expression than that used in (27).
Another popular option is to restrict the set of possible hazard functions. For example, we might assume only linear functions, that is, h(x) = ax + b for x ≥ 0 and a, b ≥ 0. This is equivalent to consider the MLE method applied to the Linear Hazard Rate (LHR) model withF(x) = exp(−bx − ax 2 /2) for x ≥ 0. This model contains the exponential model (a = 0), some Weibull models (b = 0) and it is equivalent to the normal (Gaussian) model when a > 0 and x → ∞. Thus, from (26), we need to maximize The partial derivatives are and So we have to solve the equations and n i=1 1 We can use numerical methods to solve these equations. For example, we could apply the bivariate Newton-Raphson (NR) method (see, e.g., Gil et al. 2007) by using (31)  Fig. 6 Plots of the paired records (R n , R n+1 ) from a model with hazard rate h(x) = 2x + 3 jointly with the exact (left) and NR predictions (right) of the median regression curve (black) and the 50% and 90% confidence bands (blue, red) and (32). Thus, we choose an initial point (a 0 , b 0 ) (e.g. a 0 = b 0 = 1) and then we compute the next points as is the vector with the first partial derivatives, H (a m , b m ) = (∂ i, j˜ (a m , b m )) is the Hessian matrix with the second partial derivatives and H −1 (a m , b m ) is its inverse matrix. We apply this method to the first eleven records from a model with hazard rate h(x) = 2x + 3. After six iterations (m = 6) we obtain the MLE predictions of a = 2 and b = 3,â = 0.8406102 andb = 4.7184054 with a maximum value of (â,b) = 7.947086. These estimations are not very good. However, the EMRC and the EQCB obtained from them, plotted in Fig. 6 (right), are very similar to the exact ones (given in the left plot).

A case with a real data set
We consider the data set studied in Awad and Raqab (2000), Sect. 5. It contains sizes of rocks that should be crushed. The data were published in Dunsmore (1983). The sizes are 9.3, 0.6, 24.4, 18.1, 6.6, 9.0, 14.3, 6.6, 13, 2.4, 5.6, 33.8.  Fig. 7 Plots of the paired records (R n , R n+k ) for k = 1 (left) and k = 2 (right) from the real data set in Dunsmore (1983) jointly with the predictions of the median and mean regression curves (black and purple) and the 50% and 90% confidence bands (blue, red) under an exponential model Hence, the upper record values are R 1 = 9.3, R 2 = 24.4 and R 3 = 33.88. The purpose is to predict R 4 , R 5 , . . . In Awad and Raqab (2000), it is assumed that the data come from an exponential distribution with an unknown parameter θ . Then, as we have seen before, the MLE for θ isθ n = n/R n , obtaining the following predictionŝ θ 1 = 1/R 1 = 0.10752688,θ 2 = 2/R 2 = 0.08196721,θ 3 = 3/R 3 = 0.08854782.
The stability in that predictions may confirm that the exponential model is correct. The estimation with the complete sample isθ = 0.08350731. Then, to predict R 4 we can use the curves given in (21) and (22) with the estimationθ 3 = 3/R 3 = 0.08854782. They are plotted in Fig. 7 (left, black and purple lines) jointly with the corresponding 50% (blue) and 90% (red) confidence bands for these predictions. We also plot the paired record sequence (R 1 , R 2 ) and (R 2 , R 3 ). As we can see they belong to the 50% confidence bands. Therefore, the exponential model cannot be rejected with these data.

A case study in reliability
In this section we show a simulated case study in reliability. We assume that in a production process, every day, a random chosen unit is put into operation (under a normal test or an accelerated test procedure). Let X 1 , X 2 , . . . be the IID lifetimes of these units (think, e.g., on lifetimes of batteries in mobile phones or cars). After c days, only some of these values are available in a complete form, that is, are not censored. For example, after c = 3 days, X 1 is known if and only if X 1 < 3. In general, after c days, X j is available if and only if X j < c − j + 1 (or X j + j − 1 < c). In other words, X j is known the k j -th day, with k j = [X j + j], where [x] represents the integer part of x. The procedure described below based on these assumptions can be adapted to other (similar) sampling schemes. The key idea is to use lower record values because they will be available before and with a small proportion of censored data. We want to use this process to study early failures. Therefore, we can assume that the units do not have aging and that their lifetimes follow an exponential distribution F with an unknown mean μ > 0 and constant hazard rate h(t) = θ = 1/μ for t ≥ 0. The purpose is to estimate θ (or μ) and the lower record values. Here it is important to discriminate small values from F and outliers (i.e. small values not obtained from F). To illustrate this process, we simulate 100 data. The 36 available data lifetimes after c = 75 days are given in Table 5 (in order of availability). The other data are censored. The available lower record values in this sequence are in bold case.
The incomplete (censored) sequence of lower records obtained from this table is R * 1 = 75 + , R * 2 = 21.59455, R * 3 = 3.848409 and R * 4 = 2.367513. The complete sequence of lower records from these 100 data points is 89.21383, 21.59455, 3.848409, 2.367513. To get the exact value of X 1 we have to wait until the 90-th day. However, our censored sequence at day 75 is a good approximation of the exact sequence of lower records. To get the complete sample we have to wait until the 467-th day where X 15 = 452.5437 will be available.
Instead, we want just to use the information available the 75-th day (given in Table 5). The likelihood function for the lower record values (R * 1 , . . . , R * n ) in the exponential model is (θ ) =h(x 1 ) . . .h(x n )F(x n ) = θ n e −θ x n n−1 j=1 e −θ x j 1 − e −θ x j , whereh = f /F is the reversed hazard rate. As a consequence, the joint distribution of (R * 1 , . . . , R * n ) can also be written as a distorted distribution G (x 1 , . . . , x n ) = D (F(x 1 ), . . . , F(x n )) for x 1 > · · · > x n , where the PDF d of D is equal to the one in (6). Actually, D coincides with the distortion functionD in (4). To estimate θ from our censored sample with the MLE, we maximize obtaininĝ θ = 0.01645 (orμ = 60.77). If we wait until the 90-th day to get the complete lower record sample, then the MLE prediction isθ = 0.01437 (orμ = 69.55).
It is plotted in Fig. 8 by replacing θ with the predictions obtained above from the censored (left) and uncensored (right) samples. We also include in the plots the sequences of paired lower records and the centered 50% confidence band [F −1 (0.25F(x n )), F −1 (0.75F(x n ))] and 90% confidence band [F −1 (0.05F(x n )), F −1 (0.95F(x n ))]. The respective predictions for the next lower record R * 5 are 1.1722 and 1.17.36 (they are also included in the plots). We also plot the exact curves (dashed lines) and thus we can see that the predictions are good (in both cases).
To predict the next lower record we can also use the mean regression curve obtained from (35) Fig. 8 Plots of the paired lower records (R * n , R * n+1 ) for the censored (left) and uncensored (right) samples from the data in Table 5 jointly with the predictions of the median and mean regression curves (black and purple) and the 50% and 90% confidence bands (blue, red) under an exponential model. The dashed lines represent the exact curves (obtained with θ = 0.01) for x ≥ 0. It is also plotted in Fig. 8 (purple lines) for the different estimations of θ . The predictions from this curve for R * 5 are 1.176064 (censored sample, day 75) and 1.177034 (complete record sample, day 90).
The available data in Table 5 can also be used to get other lower record sequences. For example, starting with R * 1 = X 4 = 69.04814, we get the sequence R * 2 = X 7 = 45.24274, R * 3 = X 8 = 3.848409 and R * 4 = X 23 = 2.367513. With this sequence, the MLE for θ givesθ = 0.0144 (orμ = 69.515). These predictions can be used to improve the estimated curves plotted above.

Conclusions
We have proposed a very general method to predict record values from preceding records. We focused on upper records but the results for lower records are completely analogous (see Sect. 5). These predictions are based on conditional quantiles which can also be used to provide confidence bands. We first obtain the theoretical expressions and we then show how to apply them in practice in three different cases: a given probability (distribution) model, a given parametric model with an unknown parameter and a nonparametric method for arbitrary models. All these cases show that the proposed method is a good way to manage the uncertainty in the prediction of future record values. The confidence bands are more appropriate to this purpose than point estimators.
This paper is just a first step based on the concept of multivariate distorted distribution introduced in Navarro et al. (2020). The present study for records should be completed by considering MLE for other relevant models (we just study here uniform, PHR, exponential and Pareto models). Other estimation procedures can be considered as well (see e.g., Volovskiy and Kamps 2020a, and the references therein). The nonparametric estimations could be improved by using kernel estimators (that provide continuous curves). Moreover, similar techniques can be used to predict other ordered values such as order statistics (see, Navarro et al. 2020), sequential order statistics, generalized order statistics, time series and sequences of non-identically distributed random variables. These topics are left for future research projects.