Skip to main content
Log in

U-statistics Based Tests for Marginal Hazard Rate Orderings of Two Dependent Variables

  • Original Article
  • Published:
Journal of Statistical Theory and Practice Aims and scope Submit manuscript

Abstract

We aim to compare marginal distributions of a bivariate random vector (XY) with reference to their hazard rates, \((h_F(t), h_G(t))\). In many applications, it is likely that the marginal hazard rates are ordered, e.g., \(h_F(t) \le h_G(t)\). We consider two U-statistics based tests for testing equality of the marginal hazards against the alternative that they are ordered. Further, we compare these tests with the existing W and S tests when X and Y are assumed to be independent. The two proposed statistics are also extended to cover situations when the pair (XY) is subjected to independent univariate censoring. We provide extensive simulation studies based on copulae where the power performance of these tests is assessed. The marginal distributions considered are Weibull, linear failure rate, Gompertz and three other families of distributions based on copulae. We apply the tests to two real data examples.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Bagai I, Kochar SC (1986) On tail-ordering and comparison of failure rates. Commun Stat Theory Methods 15(4):1377–1388. https://doi.org/10.1080/03610928608829189

    Article  MathSciNet  Google Scholar 

  2. Cai J, Kim J (2003) Nonparametric quantile estimation with correlated failure time data. Lifetime Data Anal 9:357–371

    Article  MathSciNet  Google Scholar 

  3. Callaert H, Veraverbeke N (1981) The order of the normal approximation for a studentized U-Statistic. Ann Stat 9(1):194–200

    Article  MathSciNet  Google Scholar 

  4. Cheng KF (1985) Tests for the equality of failure rates. Biometrika 72(1):211–215

    Article  MathSciNet  Google Scholar 

  5. Chikkagoudar MS, Shuster JS (1974) Comparison of failure rates using rank tests. J Am Stat Assoc 69:411–413

    Article  MathSciNet  Google Scholar 

  6. Frees EW, Carriere JF, Valdez EA (1996) Annuity valuation with dependent mortality. J Risk Insur 63(2):229–261

    Article  Google Scholar 

  7. Hofert M, Kojadinovic I, Maechler M, Yan J (2023) copula: multivariate Dependence with copulas. R package version 1.1-2, https://CRAN.R-project.org/package=copula

  8. Hoeffding W (1948) A class of statistics with asymptotically normal distribution. Ann Math Stat 19:293–325

    Article  MathSciNet  Google Scholar 

  9. Huang H, Zhao Y (2018) Empirical likelihood for the bivariate survival function under univariate censoring. J Stat Plan Inference 194:32–46

    Article  MathSciNet  Google Scholar 

  10. Huster WJ, Brookmeyer R, Self SG (1989) Modelling paired survival data with covariates. Biometrics 45(1):145–156

    Article  MathSciNet  Google Scholar 

  11. Kalbfleisch JD, Prentice RL (2002) The statistical analysis of failure time data, 2nd edn. Wiley, New York

    Book  Google Scholar 

  12. Kochar SC (1978) Distribution-free comparisons of two probability distributions with reference to their hazard rates. PhD thesis

  13. Kochar SC (1979) Distribution-free comparison of two probability distributions with reference to their hazard rates. Biometrika 66(3):437–41

    Article  MathSciNet  Google Scholar 

  14. Kochar SC (1981) A new distribution-free test for the equality of two failure rates. Biometrika 68(2):423–426

    Article  MathSciNet  Google Scholar 

  15. Kulathinal S, Dewan I (2023) Weighted U-statistics for likelihood-ratio ordering of bivariate data. Stat Pap 64:705–735. https://doi.org/10.1007/s00362-022-01332-w

    Article  MathSciNet  Google Scholar 

  16. Le CT, Lindgren BR (1996) Duration of ventilating tubes: a test for comparing two clustered samples of censored data. Biometrics 52(1):328–334

    Article  Google Scholar 

  17. Lee AJ (2020) U-statistics: theory and practice. CRC Press, Cambridge

    Google Scholar 

  18. Lehmann EL (1951) Consistency and unbiasedness of certain nonparametric tests. Ann Math Stat 22:165–179

    Article  MathSciNet  Google Scholar 

  19. Lai CD, Xie M (2006) Stochastic ageing and dependence for reliability. Springer, New York

    Google Scholar 

  20. Li R, Cheng Y, Chen Q, Fine J (2017) Quantile association for bivariate survival data. Biometrics 73:506–516. https://doi.org/10.1111/biom.12584

    Article  MathSciNet  Google Scholar 

  21. Luciano E, Spreeuw J, Vigna E (2008) Modelling stochastic mortality for dependent lives. Insur Math Econ 43:234–244

    Article  MathSciNet  Google Scholar 

  22. Nelsen RB (1999) An introduction to copulas. Springer, New York

    Book  Google Scholar 

  23. R Core Team (2022) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna

    Google Scholar 

  24. Shaked M, Shanthikumar JG (2007) Stochastic orders. Springer, New York

    Book  Google Scholar 

  25. Therneau T (2023) A package for survival analysis in R. https://cran.r-project.org/web/packages/survival/index.html

Download references

Acknowledgements

We wish to thank the Society of Actuaries, through the courtesy of Edward (Jed) Frees and Emiliano A. Valdez, for providing the Canadian data used in this paper. The authors would like to thank the referees and the editor for their constructive comments that have led to the improved version of the manuscript.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Leena Kulkarni.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

LK’s work was carried out while she was a visiting scientist at the Indian Statistical Institute, New Delhi, India.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 135 KB)

Appendices

Symmetric Kernel and Asymptotic Distribution of \(U_{fc}\) (10)

The symmetric version of the kernel (9) is obtained by averaging over 6 combinations, as defined below.

$$\begin{aligned} \phi _{fc}(\widetilde{Z}_1, \widetilde{Z}_2, \widetilde{Z}_3) = \left\{ \begin{array}{ l l } 1/6, &{} (\delta _1^x = 1, \delta _1^y = 1, U_1< T_1, U_1< U_2, T_1< T_3) \\ \\ &{} \text {or} \; (\delta _1^x = 1, \delta _1^y = 1, U_1< T_1, U_1< U_3, T_1< T_2) \\ \\ &{} \text {or} \; (\delta _2^x = 1, \delta _2^y = 1, U_2< T_2, U_2< U_1, T_2< T_3) \\ \\ &{} \text {or} \; (\delta _2^x = 1, \delta _2^y = 1, U_2< T_2, U_2< U_3, T_2< T_1) \\ \\ &{} \text {or} \; (\delta _3^x = 1, \delta _3^y = 1, U_3< T_3, U_3< U_1, T_3< T_2) \\ \\ &{} \text {or} \; (\delta _3^x = 1, \delta _3^y = 1, U_3< T_3, U_3< U_2, T_3< T_1) \\ \\ -1/6, &{} (\delta _1^x = 1, \delta _1^y = 1, U_1< T_1, U_1< T_3, T_1< U_2) \\ \\ &{} \text {or} \; (\delta _1^x = 1, \delta _1^y = 1, U_1< T_1, U_1< T_2, T_1< U_3) \\ \\ &{} \text {or} \; (\delta _2^x = 1, \delta _2^y = 1, U_2< T_2, U_2< T_1, T_2< U_3) \\ \\ &{} \text {or} \; (\delta _2^x = 1, \delta _2^y = 1, U_2< T_2, U_2< T_3, T_2< U_1) \\ \\ &{} \text {or} \; (\delta _3^x = 1, \delta _3^y = 1, U_3< T_3, U_3< T_1, T_3< U_2) \\ \\ &{} \text {or} \; (\delta _3^x = 1, \delta _3^y = 1, U_3< T_3, U_3< T_2, T_3< U_1) \\ \\ 0, &{}{} \; \text {otherwise}.\\ \\ \end{array} \right. \end{aligned}$$
(18)

In order to obtain the asymptotic variance of \(U_{fc}\), we first define and derive \(\psi _{fc}(z_i)\) as follows.

$$\begin{aligned} \psi _{fc}(z_i)=E[\phi _{fc}(\widetilde{Z}_i,\widetilde{Z}_j, \widetilde{Z}_k)| \widetilde{Z}_i = z_i] \end{aligned}$$

Then

$$\begin{aligned} E(\psi _{fc}(Z_i)) = E(U_{fc}), \end{aligned}$$

and

$$\begin{aligned} \sigma _{1c}^2 = Var(\psi _{fc}(Z_i)) = E[\psi _{fc}(Z_i)]^2 - E^2[\psi _{fc}(Z_i)] \end{aligned}$$

If \(E[\psi _{fc}(Z_i)]^2 < \infty \) and \(\sigma _{1c}^2>0,\) then \(\sqrt{n} (U_{fc}-E(U_{fc})) \xrightarrow {d} N(0, 9\sigma _{1c}^2) \ as \ n \rightarrow \infty .\) Under the null hypothesis, \(E(U_{fc}) = 0.\)

Fig. 1
figure 1

Otitis media data. Ratio of the Kaplan-Meier estimates of the survival functions of survival times of the surgically inserted ventilating tubes in the left ear (X) to that in the right ear (Y). Black and red colour show the estimated ratios for the control and the treatment groups, respectively

Fig. 2
figure 2

Insurance data. Ratio of estimates of survival functions of ages of a couple at contract initiation for the entire data (ALL) and a randomly selected sample of size 500 (Sample), for \(20\%\) censoring imposed on the complete data and on the selected sample (ALL 20% and Sample 20%)

Theorem 2

When \(\sigma _{1c}^2 = Var[\Psi _{fc}(Z_i)] >0\), \(\sqrt{n}(U_{fc} - E(U_{fc})) \xrightarrow {d} N(0,9\sigma _{1c}^2) \ as \ n \rightarrow \infty \). Also, \(E(U_{fc}) = 0\) under \(H_{0f}.\)

W and S Tests

We review tests W and S (Kochar [13, 14]), and provide computational details here. Let \(X_1, X_2,...X_n\) and \(Y_1, Y_2,..., Y_m\) be independent random samples from the two distributions F and, G respectively. Let \(X_{(1)}, X_{(2)},...,X_{(n)}\) be the order statistics of the X-sample and \(Y_{(1)}, Y_{(2)},...,Y_{(m)}\) be the order statistics of the Y-sample. The U-statistic W in terms of rank representation, is defined as

$$\begin{aligned} W ={} & {} \sum _{i=1}^{n} (n-i) {{R_{(i)}-i}\atopwithdelims (){2}} - \sum _{j=1}^{m} (m-j) {{S_{(j)}-j}\atopwithdelims (){2}} + \\{} & {} \sum _{j=1}^{m} (s-S_{(j)}+j) (j-1) (S_{(j)}-j) - \sum _{i=1}^{n} (R_{(i)}-i) (m-R_{(i)}+i) (n-i), \end{aligned}$$

where \(R_{(i)}, i= 1,2,...,n\) and \(S_{(j)},j=1,2,...,m\) are the ranks of \(X_{(i)}\) and \(Y_{(j)},\) respectively, in the combined sample of X and Y observations. Detailed proof of the above representation is available in Kochar [12].

To compare the performance of the proposed tests with the W test, we take \(n=m.\) Hence, under \(H_0, E(W)=0,\) and

$$\begin{aligned} V(W)= \frac{1}{210 \left( {\begin{array}{c}n\\ 2\end{array}}\right) ^2} \Big [32 n^3-28n^2-6n+8\Big ]. \end{aligned}$$

Asymptotically as \(n \rightarrow \infty \), \(\sqrt{2n} ~ W \sim N(0, 128/105).\)

Let \(Y_{(1)}, Y_{(2)},...,Y_{(m)}\) be the order statistics of the Y-sample, and let \(\tilde{S}_{(j)}\) denote the rank of the \(Y_{(j)}\) in the combined increasing arrangement of \(X's\) and \(Y's\). Then the S-statistic can be expressed as

$$\begin{aligned} S=\frac{1}{nm} \Big (\sum _{j=1}^{m}a_j \tilde{S}_{(j)}- \sum _{j=1}^{m} j a_j \Big ), \end{aligned}$$

where \(a_j = \frac{1}{2}+log\{1-j/(m+1)\}.\) Under \(H_0\) and when \(n=m\),

$$\begin{aligned} E(S) = \frac{1}{n(n+1)}\sum _{j=1}^{n} j a_j, \;\; V(S) = \frac{(2n+1)}{n^3(n+2)(n+1)^2} ~ a' ~ \Omega ~ a, \end{aligned}$$

where the matrix \(\Omega =((w_{ij})) (i,j = 1,2,...,n),\) has \(w_{ij}=i(m+1-j)=w_{ji}\) for \(i\le j\) and \(a' = (a_1, a_2,...,a_n).\) Then, asymptotically, as \(n \rightarrow \infty \), \((S-E(S))/\sqrt{V(S)} \sim N(0,1).\)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Kulkarni, L., Kulathinal, S. & Dewan, I. U-statistics Based Tests for Marginal Hazard Rate Orderings of Two Dependent Variables. J Stat Theory Pract 18, 19 (2024). https://doi.org/10.1007/s42519-024-00372-9

Download citation

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s42519-024-00372-9

Keywords

Navigation