Imputation for semiparametric transformation models with biased-sampling data

Liu, Hao; Qin, Jing; Shen, Yu

doi:10.1007/s10985-012-9225-5

Published: 18 August 2012

Volume 18, pages 470–503 (2012)

Lifetime Data Analysis

Hao Liu¹, Jing Qin² & Yu Shen³

Abstract

Length-biased sampling is widely recognized in many fields, including economics, engineering, epidemiology, health sciences, technology and wildlife management. It generates biased and right-censored data, but such data often provide the best information available for statistical inference. Unlike traditional right-censored data, length-biased data have unique aspects resulting from their sampling procedures. We exploit these unique aspects and propose a general imputation-based estimation method for analyzing length-biased data under a class of flexible semiparametric transformation models. We present new computational algorithms that can jointly estimate the regression coefficients and the baseline function semiparametrically. The imputation-based method under the transformation model provides an unbiased estimator regardless of whether the censoring depends on the covariates. We establish large-sample properties using empirical process methods. Simulation studies show that, for small to moderate sample sizes, the proposed procedure has smaller mean squared errors than two existing estimation procedures. Finally, we demonstrate the estimation procedure with a real data example.
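
To make the notion of length bias concrete, the following minimal sketch simulates sampling in which each duration is selected with probability proportional to its length. It is only an illustration of the sampling mechanism, not the imputation method proposed in the article: it ignores right censoring and covariates, and the exponential population, the sample sizes and the function name `length_biased_sample` are assumptions chosen for the example.

```python
import random


def length_biased_sample(population, k, rng):
    """Draw k values, each chosen with probability proportional to its size."""
    return rng.choices(population, weights=population, k=k)


rng = random.Random(0)

# Hypothetical population of durations: exponential with mean 1 (an assumption).
population = [rng.expovariate(1.0) for _ in range(100_000)]

# Longer durations are more likely to be observed under length-biased sampling.
biased = length_biased_sample(population, 20_000, rng)

pop_mean = sum(population) / len(population)
biased_mean = sum(biased) / len(biased)

print(f"population mean:           {pop_mean:.2f}")
print(f"length-biased sample mean: {biased_mean:.2f}")
```

For a population with density f and mean μ, the length-biased density is t f(t)/μ, so in this exponential setting the length-biased mean should be close to twice the population mean. A naive average of the sampled durations therefore overstates the typical duration, which is why methods tailored to this sampling scheme are needed.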

Author information

Authors and Affiliations

Division of Biostatistics, Dan L. Duncan Cancer Center, Baylor College of Medicine, Houston, TX, 77030, USA
Hao Liu
Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, 20892, USA
Jing Qin
Department of Biostatistics, The University of Texas M. D. Anderson Cancer Center, Houston, TX, 77030, USA
Yu Shen

Corresponding author

Correspondence to Hao Liu.

Electronic Supplementary Material

Below is the Electronic Supplementary Material.

ESM 1 (PDF 128 kb)

About this article

Cite this article

Liu, H., Qin, J. & Shen, Y. Imputation for semiparametric transformation models with biased-sampling data. Lifetime Data Anal 18, 470–503 (2012). https://doi.org/10.1007/s10985-012-9225-5

Received: 30 November 2011
Accepted: 01 August 2012
Published: 18 August 2012
Issue Date: October 2012
DOI: https://doi.org/10.1007/s10985-012-9225-5
