Statistics and Computing, Volume 20, Issue 2, pp 165–176

Rank-based variable selection with censored data

  • Jinfeng Xu
  • Chenlei Leng
  • Zhiliang Ying


A rank-based variable selection procedure is developed for the semiparametric accelerated failure time model with censored observations, a setting in which the penalized likelihood (partial likelihood) method is not directly applicable.

The new method penalizes the rank-based Gehan-type loss function with the ℓ1 penalty. To correctly choose the tuning parameters, a novel likelihood-based χ²-type criterion is proposed. Desirable properties of the estimator such as the oracle properties are established through the local quadratic expansion of the Gehan loss function.
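
For orientation, the penalized objective can be written out as a sketch. The abstract does not give the formula, so the display below uses the standard Gehan-type loss from the rank-regression literature; the 1/n² normalization, the single tuning parameter λ, and the symbols (observed time T̃_i, censoring indicator δ_i, covariates x_i) are assumptions made here. The "Adaptive Lasso" keyword suggests the paper may also use coefficient-specific weights, which are not shown.

```latex
\[
\hat{\beta} \;=\; \arg\min_{\beta}\;
\left\{
\frac{1}{n^{2}} \sum_{i=1}^{n} \sum_{j=1}^{n}
\delta_i \,\bigl\{ e_i(\beta) - e_j(\beta) \bigr\}^{-}
\;+\; \lambda \sum_{k=1}^{p} \lvert \beta_k \rvert
\right\},
\qquad
e_i(\beta) = \log \tilde{T}_i - x_i^{\top}\beta,
\quad
a^{-} = \max(-a,\, 0).
\]
```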

In particular, our method can be easily implemented with standard linear programming packages and is hence numerically convenient. Extensions to marginal models for multivariate failure time are also considered. The performance of the new procedure is assessed through extensive simulation studies and illustrated with two real examples.
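
The linear programming remark rests on the fact that a Gehan-type loss plus an ℓ1 penalty is convex and piecewise linear in the coefficients. The following is a minimal sketch of that idea, not the authors' implementation: it evaluates a standard form of the penalized loss and, for a single covariate only, finds the exact minimiser by scanning the kinks of the objective instead of calling a linear programming solver. The function names, the 1/n² scaling, and the toy data are all assumptions made for illustration.

```python
from typing import List, Sequence


def gehan_loss(beta: Sequence[float], x: List[Sequence[float]],
               log_t: Sequence[float], delta: Sequence[int]) -> float:
    """Gehan-type loss: (1/n^2) * sum_i sum_j delta_i * max(e_j - e_i, 0).

    The 1/n^2 scaling is an assumption; other normalisations are in use.
    """
    n = len(log_t)
    # Residuals on the log-time scale: e_i = log(T_i) - x_i' beta.
    e = [log_t[i] - sum(b * v for b, v in zip(beta, x[i])) for i in range(n)]
    total = 0.0
    for i in range(n):
        if not delta[i]:
            continue  # a censored subject only enters as the comparison index j
        for j in range(n):
            total += max(e[j] - e[i], 0.0)
    return total / n ** 2


def penalized_gehan(beta: Sequence[float], x: List[Sequence[float]],
                    log_t: Sequence[float], delta: Sequence[int],
                    lam: float) -> float:
    """Gehan-type loss plus an l1 (Lasso) penalty with tuning parameter lam."""
    return gehan_loss(beta, x, log_t, delta) + lam * sum(abs(b) for b in beta)


def fit_one_covariate(x1: Sequence[float], log_t: Sequence[float],
                      delta: Sequence[int], lam: float) -> float:
    """Exact minimiser for ONE covariate (hypothetical helper).

    The objective is convex and piecewise linear in beta, so a minimum is
    attained at a kink: either beta = 0 (from the penalty) or a pairwise
    slope (log_t[i] - log_t[j]) / (x1[i] - x1[j]) with subject i uncensored.
    """
    n = len(log_t)
    candidates = [0.0]
    for i in range(n):
        for j in range(n):
            if delta[i] and x1[i] != x1[j]:
                candidates.append((log_t[i] - log_t[j]) / (x1[i] - x1[j]))
    x = [[v] for v in x1]
    return min(candidates,
               key=lambda b: penalized_gehan([b], x, log_t, delta, lam))


# Toy data (made up): log failure times roughly equal to the covariate,
# with the third subject censored.
x1 = [0.0, 1.0, 2.0, 3.0]
log_t = [0.1, 1.0, 2.1, 2.9]
delta = [1, 1, 0, 1]

beta_unpenalized = fit_one_covariate(x1, log_t, delta, lam=0.0)
beta_heavy_penalty = fit_one_covariate(x1, log_t, delta, lam=10.0)
```

With several covariates the kink scan no longer applies; the same objective would then be passed to a linear programming routine, as the abstract indicates. A sufficiently large `lam` shrinks the coefficient exactly to zero, which is what makes the penalty usable for variable selection.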


Accelerated failure time model; Adaptive Lasso; BIC; Gehan-type loss function; Lasso; Variable selection





Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Department of Statistics and Applied Probability, Risk Management Institute, National University of Singapore, Singapore
  2. Department of Statistics, Columbia University, New York, USA
