Abstract
Length-biased sampling is widely recognized in many fields, including economics, engineering, epidemiology, health sciences, technology and wildlife management. It generates biased and right-censored data, but such data often provide the best information available for statistical inference. Unlike traditional right-censored data, length-biased data have unique features that result from the sampling procedure. We exploit these features and propose a general imputation-based estimation method for analyzing length-biased data under a class of flexible semiparametric transformation models. We present new computational algorithms that jointly estimate the regression coefficients and the baseline function semiparametrically. The imputation-based method under the transformation model provides an unbiased estimator regardless of whether the censoring is independent of the covariates. We establish large-sample properties using empirical process methods. Simulation studies show that, for small to moderate sample sizes, the proposed procedure has smaller mean squared errors than two existing estimation procedures. Finally, we demonstrate the estimation procedure with a real data example.
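As a rough illustration of why length-biased data need special handling, the sketch below simulates the sampling mechanism itself. It is not the imputation method proposed in the article: it only shows that when a subject's chance of being sampled is proportional to its duration, the sampled durations overstate the population mean. The function name `length_biased_sample`, the exponential population with mean 1, and the resampling-from-a-pool approach are all assumptions made for this example, and right censoring is left out.

```python
import random


def length_biased_sample(draw, n, rng):
    """Return n durations sampled with probability proportional to length.

    Assumption for illustration: approximate the length-biased law by
    weighted resampling from a large pool of unbiased draws.
    """
    pool = [draw(rng) for _ in range(20 * n)]
    # Each pooled duration is selected with weight equal to its own length.
    return rng.choices(pool, weights=pool, k=n)


def draw_exponential(rng):
    """Hypothetical population: exponential durations with mean 1."""
    return rng.expovariate(1.0)


rng = random.Random(0)
n = 20000

unbiased = [draw_exponential(rng) for _ in range(n)]
biased = length_biased_sample(draw_exponential, n, rng)

unbiased_mean = sum(unbiased) / len(unbiased)
biased_mean = sum(biased) / len(biased)

print(f"mean of unbiased sample:      {unbiased_mean:.3f}")
print(f"mean of length-biased sample: {biased_mean:.3f}")
```

For a duration T with finite second moment, the length-biased mean is E[T^2]/E[T], which equals 2 for the exponential population assumed here, so the second printed mean should be close to twice the first. A naive analysis of such a sample would therefore overestimate typical durations.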
Cite this article
Liu, H., Qin, J. & Shen, Y. Imputation for semiparametric transformation models with biased-sampling data. Lifetime Data Anal 18, 470–503 (2012). https://doi.org/10.1007/s10985-012-9225-5
DOI: https://doi.org/10.1007/s10985-012-9225-5