Nonparametric and semiparametric regression estimation for length-biased survival data

Lifetime Data Analysis

Abstract

For the past several decades, nonparametric and semiparametric modeling of conventional right-censored survival data has been investigated intensively under a noninformative censoring mechanism. However, these methods may not be applicable to right-censored survival data that arise from prevalent cohorts, where the failure times are subject to length-biased sampling. This review article summarizes both newly developed and established methods for analyzing length-biased data.

Acknowledgments

The work was supported in part by the U.S. NIH Grants CA079466 and CA016672. The authors thank Professor Asgharian and the investigators from the Canadian Study of Health and Aging for generously sharing the dementia data. The data reported in this article were collected as part of the Canadian Study of Health and Aging. The core study was funded by the Seniors’ Independence Research Program, through the National Health Research and Development Program (NHRDP) of Health Canada Project 6606-3954-MC(S). Additional funding was provided by Pfizer Canada Incorporated through the Medical Research Council/Pharmaceutical Manufacturers Association of Canada Health Activity Program, NHRDP Project 6603-1417-302(R), Bayer Incorporated, and the British Columbia Health Research Foundation Projects 38 (93-2) and 34 (96-1). The study was coordinated through the University of Ottawa and the Division of Aging and Seniors, Health Canada.

Author information

Corresponding author

Correspondence to Yu Shen.

About this article

Cite this article

Shen, Y., Ning, J. & Qin, J. Nonparametric and semiparametric regression estimation for length-biased survival data. Lifetime Data Anal 23, 3–24 (2017). https://doi.org/10.1007/s10985-016-9367-y

  • DOI: https://doi.org/10.1007/s10985-016-9367-y
