In this section we provide some examples to illustrate the theoretical findings described in the previous sections. In the first one we consider the sum of two dependent random variables satisfying the GK model proposed in Genest and Kolev (2021), i.e., model (2.8).
Example 1
Let us assume that \((X_1,X_2)\) satisfies (2.8) for \(\alpha \ne \beta \) and \({\bar{G}}(x)=(1+x)^{-\gamma }\) for \(x\ge 0\) (Pareto type II survival function) and \(\gamma >0\). This model is equivalent to considering an Archimedean Clayton survival copula with \(\theta =1/\gamma \) [see (4.2.1) in Nelsen (2006, p. 116)] and Pareto type II marginals. Then, from (3.3), the joint distribution function of \((X_1,S)\) is
$$\begin{aligned} {\mathbf {G}}(x,s)&= G(\alpha x)-\frac{\alpha }{\alpha -\beta } G ((\alpha -\beta )x+\beta s )+ \frac{\alpha }{\alpha -\beta } G(\beta s )\\&=1-\left( 1+\alpha x\right) ^{-\gamma }+\frac{\alpha }{\alpha -\beta }\left( 1+(\alpha -\beta )x+\beta s\right) ^{-\gamma } -\frac{\alpha }{\alpha -\beta } \left( 1+\beta s\right) ^{-\gamma } \end{aligned}$$
for \(0\le x\le s\). Hence, the distribution function \(F_S\) of S (i.e., the C-convolution) is
$$\begin{aligned} F_S(s)={\mathbf {G}}(s,s)=1+\frac{\beta }{\alpha -\beta } (1+\alpha s)^{-\gamma } -\frac{\alpha }{\alpha -\beta } (1+\beta s)^{-\gamma } \end{aligned}$$
for \(s\ge 0\). Its PDF is
$$\begin{aligned} f_S(s)=\frac{\alpha \beta \gamma }{\alpha -\beta } (1+\beta s)^{-\gamma -1}-\frac{\alpha \beta \gamma }{\alpha -\beta } (1+\alpha s)^{-\gamma -1} \end{aligned}$$
for \(s\ge 0\). The distribution of S is a negative mixture of two Pareto type II distributions; thus, its hazard rate goes to zero as \(s\rightarrow \infty \) (the common limit of the hazard rates of the members of the C-convolution). These hazard rates are plotted in Fig. 1 (right), jointly with the associated PDF functions (left), for \(\gamma =\alpha =2\) and \(\beta =1\). Note that the hazard rates of \(X_1\) and \(X_2\) are decreasing while that of S is not monotone, showing that the decreasing failure rate (DFR) class is not preserved by the sum of dependent random variables. Some preservation properties can be seen in Navarro and Pellerey (2021).
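As a numerical sanity check (a sketch, not part of the original analysis; the helper names are ours), one can evaluate \(F_S\), \(f_S\) and the resulting hazard rate for the Fig. 1 values \(\gamma =\alpha =2\), \(\beta =1\) and confirm that the hazard rate of S first rises and then decays to zero:

```python
# Numerical check of Example 1 with gamma = alpha = 2, beta = 1
# (the values used in Fig. 1); function names are hypothetical.
gamma, alpha, beta = 2.0, 2.0, 1.0

def F_S(s):
    # C-convolution: distribution function of S = X1 + X2
    return (1 + beta / (alpha - beta) * (1 + alpha * s) ** (-gamma)
              - alpha / (alpha - beta) * (1 + beta * s) ** (-gamma))

def f_S(s):
    # PDF of S: negative mixture of two Pareto type II densities
    c = alpha * beta * gamma / (alpha - beta)
    return c * ((1 + beta * s) ** (-gamma - 1) - (1 + alpha * s) ** (-gamma - 1))

def hazard(s):
    return f_S(s) / (1.0 - F_S(s))

# hazard rate at increasing time points: rises at first, then decays to 0
rates = [hazard(s) for s in (0.1, 1.0, 10.0, 100.0)]
print(rates)
```

Printing the list shows the non-monotone shape visible in Fig. 1 (right).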
If we want to predict \(X_1\) from \(S=s\), we need the conditional distribution obtained from (4.8) as
$$\begin{aligned} F_{X_1\mid S}(x\mid s)&=\frac{g((\alpha -\beta )x+\beta s)-g(\beta s)}{ g(\alpha s)-g(\beta s)}\\&=\frac{(1+(\alpha -\beta )x+\beta s)^{-\gamma -1}-(1+\beta s)^{-\gamma -1}}{ (1+\alpha s)^{-\gamma -1}-(1+\beta s)^{-\gamma -1}} \end{aligned}$$
for \(0\le x\le s\). Its inverse function is then
$$\begin{aligned} F^{-1}_{X_1\mid S}(q\mid s)= \frac{-1-\beta s+ \left( q(1+\alpha s)^{-\gamma -1}+(1-q) (1+\beta s)^{-\gamma -1}\right) ^{-1/(\gamma +1)}}{\alpha -\beta } \end{aligned}$$
for \(0<q<1\). The median regression curve is obtained by replacing q with 1/2. It is plotted in Fig. 2, jointly with a sample from \((X_1,S)\) and the associated 50% and 90% centered confidence bands. The figure also includes the parametric (top) and nonparametric (bottom) estimates of these curves (dashed lines). Here, nonparametric means that we use the linear quantile regression procedure in R (see Koenker (2005); Koenker and Bassett (1978)).
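The closed-form quantile function above can be checked against the conditional distribution function it inverts (a sketch with arbitrary admissible parameter values; names are ours):

```python
# Verify that the closed-form F^{-1}_{X1|S} inverts F_{X1|S} of Example 1.
gamma, alpha, beta = 2.0, 2.0, 1.0   # illustrative values, as in Fig. 1

def F_cond(x, s):
    # conditional distribution function of (X1 | S = s), 0 <= x <= s
    num = (1 + (alpha - beta) * x + beta * s) ** (-gamma - 1) - (1 + beta * s) ** (-gamma - 1)
    den = (1 + alpha * s) ** (-gamma - 1) - (1 + beta * s) ** (-gamma - 1)
    return num / den

def Q_cond(q, s):
    # closed-form conditional quantile function, 0 < q < 1
    mix = q * (1 + alpha * s) ** (-gamma - 1) + (1 - q) * (1 + beta * s) ** (-gamma - 1)
    return (-1 - beta * s + mix ** (-1.0 / (gamma + 1))) / (alpha - beta)

# median regression value at s = 2 lies strictly inside (0, s)
print(Q_cond(0.5, 2.0))
```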
To estimate the parameters in the model from the sample we use Kendall's tau coefficient of the pair \((X_1,X_2)\), which is given by
$$\begin{aligned} \tau =\frac{\theta }{2+\theta }=\frac{1}{1+2\gamma } \end{aligned}$$
[see Nelsen (2006, p. 163)]. Therefore, \(\gamma \) can be estimated by
$$\begin{aligned} {\widehat{\gamma }}= \frac{1-{\widehat{\tau }}}{2{\widehat{\tau }}}=\frac{1-0.158}{2\cdot 0.158}=2.664557, \end{aligned}$$
where \({\widehat{\tau }}\) is the estimator of Kendall's tau. Then, to estimate \(\alpha \) and \(\beta \), we recall that \({\mathbb {E}}(X_1)=1/(\alpha (\gamma -1))\) and \({\mathbb {E}}(X_2)=1/(\beta (\gamma -1))\), obtaining
$$\begin{aligned} {\widehat{\alpha }}=\frac{1}{({\widehat{\gamma }}-1 ){\bar{X}}_1 }=\frac{1}{1.664557\cdot 0.3880776}=1.548042 \end{aligned}$$
and
$$\begin{aligned} {\widehat{\beta }}=\frac{1}{({\widehat{\gamma }}-1 ){\bar{X}}_2 }=\frac{1}{1.664557\cdot 0.8674393}=0.6925677. \end{aligned}$$
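The three point estimates can be reproduced with a few lines of code (a sketch; the sample values \({\widehat{\tau }}\), \({\bar{X}}_1\), \({\bar{X}}_2\) are taken from the text):

```python
# Reproducing the point estimates of Example 1 from the sample summaries.
tau_hat = 0.158                          # estimated Kendall's tau (from the text)
x1_bar, x2_bar = 0.3880776, 0.8674393    # sample means (from the text)

gamma_hat = (1 - tau_hat) / (2 * tau_hat)    # inverts tau = 1/(1 + 2*gamma)
alpha_hat = 1 / ((gamma_hat - 1) * x1_bar)   # from E(X1) = 1/(alpha*(gamma-1))
beta_hat = 1 / ((gamma_hat - 1) * x2_bar)    # from E(X2) = 1/(beta*(gamma-1))
print(gamma_hat, alpha_hat, beta_hat)
```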
For the nonparametric linear estimators of the quantile regression curves, we use the R library quantreg (see Koenker (2005); Koenker and Bassett (1978); Navarro (2020)). The estimated median regression line for predicting \(X_1\) from S obtained from our sample is
$$\begin{aligned} {\widehat{m}}_{X_1\mid S}(s)=0.09752378 +0.17721635 s. \end{aligned}$$
The procedure to predict S from \(X_1\) is analogous.
In the second example we consider the more general TTE dependence model; in this case we show how to predict S from \(X_1\).
Example 2
Let \((X_1,X_2)\) have joint survival function defined as in (2.5), where \({\widehat{D}}\) is given in (2.7). Thus, we can use the expressions obtained in Sect. 4, (4.1) and (4.2), to predict S from \(X_1\).
For example, we can choose
$$\begin{aligned} {\bar{G}}(x)={\bar{H}}_1(x)={\bar{H}}_2(x)=c\ (1-\Phi (1+x))=c\ \Phi (-1-x) \end{aligned}$$
for \(x\ge 0\), where \(\Phi \) is the standard normal distribution function and \(c=1/\Phi (-1)=6.302974\) (i.e., G is a truncated normal distribution). Hence, \(g(x)=c\ \phi (1+x)\), where \(\phi =\Phi '\) is the PDF of the standard normal distribution. Note that, in this case, the corresponding Archimedean copula (which we could call a Gaussian Archimedean copula) does not have an explicit expression (since it depends on \({\bar{G}}\) and on \({\bar{G}}^{-1}\)). Thus, this is a practical example where the distortion representation can be used as a convenient alternative.
Under the previous assumptions, the inverse functions are
$$\begin{aligned} {\bar{G}}^{-1}(x)=-1-\Phi ^{-1}\left( \frac{x}{c}\right) \end{aligned}$$
and
$$\begin{aligned} g^{-1}(x)=-1+\left( 2\ln c-\ln (2\pi )-2\ln x \right) ^{1/2}. \end{aligned}$$
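Both inverse formulas can be verified numerically with the standard library (a sketch; `statistics.NormalDist` supplies \(\Phi \), \(\phi \) and \(\Phi ^{-1}\), the helper names are ours):

```python
from math import log, pi, sqrt
from statistics import NormalDist

nd = NormalDist()            # standard normal distribution
c = 1 / nd.cdf(-1)           # c = 1/Phi(-1), approx. 6.302974 as in the text

def Gbar(x):
    return c * nd.cdf(-1 - x)        # truncated-normal survival function

def g(x):
    return c * nd.pdf(1 + x)         # its density g(x) = c*phi(1+x)

def Gbar_inv(y):
    return -1 - nd.inv_cdf(y / c)    # inverse of Gbar

def g_inv(y):
    # inverse of g, from (1+x)^2 = 2 ln c - ln(2 pi) - 2 ln y
    return -1 + sqrt(2 * log(c) - log(2 * pi) - 2 * log(y))

print(Gbar_inv(Gbar(0.5)), g_inv(g(0.5)))   # both should recover 0.5
```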
By using these expressions we compute \({{\bar{F}}}^{-1}_{S\mid X_1}\) as in (4.2), obtaining the quantile regression curve plotted in Fig. 3 (top). The same figure also includes a sample of \(n=100\) points from \((X_1,S)\) and the exact centered 50% and 90% (blue) confidence bands. Moreover, it shows the plot of the nonparametric linear quantile estimate (dashed lines) obtained from this sample.
As we know that \(X_1<S\), we could also provide bottom 50% and 90% confidence bands obtained as \(\left[ x,{{\bar{F}}}^{-1}_{S\mid X_1}(0.5\mid x)\right] \) and \(\left[ x,{{\bar{F}}}^{-1}_{S\mid X_1}(0.1\mid x)\right] \), respectively. They are plotted in Fig. 3 (bottom). In this case, the median regression curve is also the upper limit of the 50% confidence band. In our sample we obtain 10 data points above the upper (exact) limit and 46 above the median regression curve (i.e., 54 data points in the exact bottom 50% confidence band). The estimated median regression line obtained from our sample is
$$\begin{aligned} {\widehat{m}}_{S\mid X_1}(x)=0.3159734 +0.7284655x \end{aligned}$$
for \(x\ge 0\).
In the next example we show a case of model (2.8) that cannot be represented with an explicit Archimedean copula, and for which the distortion representation is therefore a useful alternative tool. In fact, in this example \({\bar{G}}\) is convex and an explicit expression for its inverse is not available. For this model we compute explicit expressions for the C-convolution and the two conditional survival functions.
Example 3
Let us consider (2.8) with \(\alpha \ne \beta \) and the survival function
$$\begin{aligned} {\bar{G}}(x)=\frac{2+x}{2} e^{-x} \end{aligned}$$
for \(x\ge 0\). Its PDF is
$$\begin{aligned} g(x)=\frac{1+x}{2} e^{-x} \end{aligned}$$
for \(x\ge 0\), that is, a translated Gamma (Erlang) distribution. The joint survival function of \((X_1,X_2)\) is
$$\begin{aligned} {\bar{\mathbf {F}}}(x_1,x_2)={\bar{G}}(\alpha x_1+\beta x_2)= \frac{2+\alpha x_1+\beta x_2}{2} \exp (-\alpha x_1-\beta x_2) \end{aligned}$$
for \(x_1,x_2\ge 0\). The marginals also follow translated Gamma distributions.
The joint distribution of \((X_1,S)\) can be obtained from (3.3). From this expression, the survival function of S (C-convolution) is
$$\begin{aligned} {\bar{F}}_S(s)&=\frac{\alpha }{\alpha -\beta } {\bar{G}}(\beta s )- \frac{\beta }{\alpha -\beta } {\bar{G}} (\alpha s )\\&=\frac{\alpha }{\alpha -\beta }e^{-\beta s}- \frac{\beta }{\alpha -\beta } e^{-\alpha s} + \frac{\alpha \beta s}{2(\alpha -\beta )}\left( e^{-\beta s}-e^{-\alpha s}\right) \end{aligned}$$
for \(s\ge 0\). Note that it is a negative mixture of two translated Gamma distributions.
The conditional survival function of \((S\mid X_1=x)\) can be obtained from (4.3) as
$$\begin{aligned} {\bar{F}}_{S\mid X_1}(s\mid x)=\frac{g((\alpha -\beta )x+\beta s)}{g(\alpha x)}=\frac{1+ (\alpha -\beta )x+\beta s}{1+\alpha x}e^{-\beta (s-x)} \end{aligned}$$
for \(s\ge x\). Analogously, from (4.8), the conditional survival function of \((X_1\mid S=s)\) is
$$\begin{aligned} {\bar{F}}_{X_1\mid S}(x\mid s)= & {} \frac{g(\alpha s)-g((\alpha -\beta )x+\beta s)}{g(\alpha s)-g(\beta s)}\nonumber \\= & {} \frac{1+\alpha s -(1+(\alpha -\beta )x+\beta s)e^{(\alpha -\beta )(s-x)}}{1+\alpha s-(1+\beta s)e^{(\alpha -\beta )s}} \end{aligned}$$
for \(0\le x\le s\).
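The closed forms above can be checked against the defining g-ratios (a sketch with the Fig. 4 values \(\alpha =2\), \(\beta =1\); function names are ours):

```python
from math import exp

alpha, beta = 2.0, 1.0   # values used in Fig. 4

def Gbar(x): return (2 + x) / 2 * exp(-x)   # survival function of Example 3
def g(x):    return (1 + x) / 2 * exp(-x)   # its PDF

def Fbar_S(s):
    # C-convolution written through Gbar
    return (alpha * Gbar(beta * s) - beta * Gbar(alpha * s)) / (alpha - beta)

def Fbar_S_closed(s):
    # the expanded closed form from the text
    a, b = alpha, beta
    return (a / (a - b) * exp(-b * s) - b / (a - b) * exp(-a * s)
            + a * b * s / (2 * (a - b)) * (exp(-b * s) - exp(-a * s)))

def Fbar_cond_S(s, x):
    # closed form of the survival function of (S | X1 = x), s >= x
    return (1 + (alpha - beta) * x + beta * s) / (1 + alpha * x) * exp(-beta * (s - x))

def Fbar_cond_X1(x, s):
    # closed form of the survival function of (X1 | S = s), 0 <= x <= s
    num = 1 + alpha * s - (1 + (alpha - beta) * x + beta * s) * exp((alpha - beta) * (s - x))
    den = 1 + alpha * s - (1 + beta * s) * exp((alpha - beta) * s)
    return num / den

print(Fbar_S(1.0), Fbar_cond_S(2.0, 0.7), Fbar_cond_X1(0.5, 1.0))
```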
In Fig. 4 we plot the probability density (left) and hazard rate (right) functions of \(X_1\) (red), \(X_2\) (green) and S (blue) when \(\alpha =2\) and \(\beta =1\). Note that both marginals are IFR and the same holds for S. Also note that the limiting behavior of the hazard rate of S coincides with that of the best component in the sum (\(X_2\)). This agrees with the results on mixtures obtained in Lemma 3.3 of Navarro and Shaked (2006) (or Lemma 4.6 in Navarro and Sarabia (2020)) and with Theorem 1 of Block et al. (2015) on usual convolutions.
In the last example we show a case dealing with the GK model (2.8) where the inverse of the conditional distribution function \(F_{X_1\mid S}\) of \((X_1\mid S)\) cannot be obtained in a closed form. Then we need to use numerical methods (or implicit function plots). Moreover, it also shows that the quantile (median) regression curve \(m_{X_1\mid S}(s)=F^{-1}_{X_1\mid S}(0.5\mid s)\) is not always increasing.
Example 4
Let us consider the model (2.8) with a survival copula in the family of Gumbel–Barnett copulas [see (4.2.9) in Nelsen (2006, p. 116)]. In this case, the additive generator of the copula is \({\bar{G}}^{-1}(x)=\ln (1-\theta \ln x)\) for \(x\in (0,1]\) and \(\theta \in (0,1]\). These copulas are strict Archimedean copulas and the independence (product) copula is obtained for \(\theta \rightarrow 0\). Hence,
$$\begin{aligned} {\bar{G}}(x)= \exp \left( \frac{1}{\theta }- \frac{1}{\theta }e^x \right) \end{aligned}$$
and
$$\begin{aligned} g(x)=\frac{1}{\theta }\exp \left( x+\frac{1}{\theta }- \frac{1}{\theta }e^x \right) \end{aligned}$$
for \(x\ge 0\). Note that the inverse of g does not have an explicit form, thus one cannot use (4.9) to compute the quantile functions of \((X_1\mid S)\). The same happens in (4.4) for the quantile functions of \((S\mid X_1)\).
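The pair formed by \({\bar{G}}\) and the Gumbel–Barnett generator can still be verified directly (a sketch; the value of \(\theta \) is an arbitrary choice in \((0,1]\)):

```python
from math import exp, log

theta = 0.5   # any theta in (0, 1]

def Gbar(x):
    # survival function obtained from the Gumbel-Barnett generator
    return exp(1 / theta - exp(x) / theta)

def generator(u):
    # additive generator ln(1 - theta*ln(u)), i.e., the inverse of Gbar
    return log(1 - theta * log(u))

print(generator(Gbar(1.0)))   # should recover 1.0
```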
However, it is possible to plot the level curves of the conditional distribution function by using (4.8), obtaining
$$\begin{aligned} {F}_{X_1\mid S}(x\mid s)=\frac{g((\alpha -\beta )x+\beta s)-g(\beta s)}{ g(\alpha s)-g(\beta s)} \end{aligned}$$
(5.1)
when \(\alpha \ne \beta \). For example, if we choose \(\alpha =3\), \(\beta =1\) and \(\theta =1\) in (5.1), we get
$$\begin{aligned} {F}_{X_1\mid S}(x\mid s)=\frac{g(2x+ s)-g(s)}{ g(3s)-g(s)} =\frac{\exp \left( 2x+s+ 1 - e^{2x+s} \right) -\exp \left( s+ 1 - e^{s} \right) }{\exp \left( 3s+ 1 - e^{3s} \right) -\exp \left( s+ 1 - e^{s} \right) } \end{aligned}$$
for \(0\le x\le s\). These level curves are plotted in Fig. 5 (left) for \(q=0.05, 0.25, 0.5, 0.75, 0.95\). Note that the median regression curve \(m_{X_1\mid S}(s)=F^{-1}_{X_1\mid S}(0.5\mid s)\) (red line, left) is first increasing and then decreasing. To explain this surprising fact we plot \({F}_{X_1\mid S}(x\mid s)\) in Fig. 5 (right) for different values of s, where one can observe that these distribution functions are not ordered in s, that is, \((X_1\mid S=s)\) is not stochastically increasing in s. Here the greater values for \(X_1\) are obtained when \(S\approx 0.6\) (green line). Also note that \({\mathbb {E}}(X_2)=3{\mathbb {E}}(X_1)\) and that \(X_1\) and \(X_2\) are negatively correlated. Therefore, the greater values of S are mainly obtained from the greater values of \(X_2\) and the smaller values of \(X_1\). For that reason, \(m_{X_1\mid S}\) is decreasing at the end.
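Since the quantile function has no closed form here, the median regression curve must be computed numerically; a simple bisection is enough because \({F}_{X_1\mid S}(\cdot \mid s)\) is increasing on \([0,s]\) (a sketch, not the authors' code; it reproduces the increase-then-decrease of \(m_{X_1\mid S}\)):

```python
from math import exp

theta, alpha, beta = 1.0, 3.0, 1.0   # values chosen in the text

def g(x):
    return (1 / theta) * exp(x + 1 / theta - exp(x) / theta)

def F_cond(x, s):
    # equation (5.1): conditional distribution function of (X1 | S = s)
    return (g((alpha - beta) * x + beta * s) - g(beta * s)) / (g(alpha * s) - g(beta * s))

def median_reg(s, tol=1e-10):
    # invert F_cond(. | s) = 1/2 by bisection on [0, s]
    lo, hi = 0.0, s
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F_cond(mid, s) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# median regression values: increase towards s around 0.6, then decrease
m = [median_reg(s) for s in (0.3, 0.6, 1.5, 3.0)]
print(m)
```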
Also note that
$$\begin{aligned} {\mathbb {C}}ov(X_1,S)={\mathbb {V}}ar(X_1)+{\mathbb {C}}ov(X_1,X_2)={\mathbb {V}}ar(X_1)+{\mathbb {E}}(X_1X_2)-{\mathbb {E}}(X_1){\mathbb {E}}(X_2). \end{aligned}$$
Therefore, \({\mathbb {C}}ov(X_1,S)\ge 0\) when \({\mathbb {C}}ov(X_1,X_2)\ge 0\) and, in particular, when \(X_1\) and \(X_2\) are independent. However, the covariance \({\mathbb {C}}ov(X_1,S)\) will be negative if \({\mathbb {V}}ar(X_1)<-{\mathbb {C}}ov(X_1,X_2)\). In our case, the marginal reliability functions of \(X_1\) and \(X_2\) are \({\bar{F}}_1(t)={\bar{G}}(3t)\) and \({\bar{F}}_2(t)={\bar{G}}(t)\), respectively. Their means are \({\mathbb {E}}(X_1)=0.198782\) and \({\mathbb {E}}(X_2)=0.596347\), their variances are \({\mathbb {V}}ar(X_1)=0.019589\) and \({\mathbb {V}}ar(X_2)=0.176301\), and their covariance is \({\mathbb {C}}ov(X_1,X_2)=-0.029889\). Hence
$$\begin{aligned} {\mathbb {C}}ov(X_1,S)={\mathbb {V}}ar(X_1)+{\mathbb {C}}ov(X_1,X_2)=0.019589-0.029889=-0.010299<0. \end{aligned}$$
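These moments can be checked numerically from the survival representation \({\mathbb {E}}(X)=\int _0^\infty {\bar{F}}(t)\,dt\) and \({\mathbb {E}}(X_1X_2)=\int _0^\infty \int _0^\infty {\bar{\mathbf {F}}}(x,y)\,dx\,dy\) (a sketch with a plain composite Simpson rule; helper names are ours, and the integration limits rely on the doubly exponential decay of \({\bar{G}}\)):

```python
from math import exp

def Gbar(x):
    # survival function of Example 4 with theta = 1
    return exp(1.0 - exp(x))

def simpson(f, a, b, n=400):
    # composite Simpson's rule on [a, b] with n (even) subintervals
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h) for k in range(1, n))
    return s * h / 3

# moments of X1 (survival Gbar(3t)) and X2 (survival Gbar(t))
E1 = simpson(lambda t: Gbar(3 * t), 0, 3)
E2 = simpson(lambda t: Gbar(t), 0, 9)
E1sq = simpson(lambda t: 2 * t * Gbar(3 * t), 0, 3)   # E(X1^2) = int 2t P(X1 > t) dt
V1 = E1sq - E1 ** 2

# E(X1*X2) as the double integral of the joint survival function
E12 = simpson(lambda x: simpson(lambda y: Gbar(3 * x + y), 0, 9), 0, 3)
cov12 = E12 - E1 * E2
cov1S = V1 + cov12
print(E1, E2, V1, cov12, cov1S)
```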