An overview of nonparametric contributions to the problem of functional estimation from biased data

Journal: Test

Abstract

This paper presents an overview of nonparametric contributions to the literature on estimation problems when the observations are taken from weighted distributions. Many situations involving biased data in a very diverse range of contexts are considered, with emphasis being placed on the applications of smoothing techniques to estimate curves, such as density and regression functions. Some important problems encountered in semiparametric models are also analyzed.
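As a minimal illustration of the kind of problem the abstract describes, suppose observations Y are drawn not from the target density f but from the weighted density g(y) = w(y) f(y) / μ, where w is a known nonnegative weight function and μ = E_f[w(X)]. Since E_g[1/w(Y)] = 1/μ, one standard correction reweights each kernel by 1/w(Y_i) and renormalises. The sketch below assumes a Gaussian kernel, a fixed bandwidth h, and a known w; the function name `weighted_kde` and the simulated example (f taken as Gamma(2, 1) with length bias w(y) = y) are illustrative choices, not taken from the paper.

```python
import numpy as np

def weighted_kde(x_grid, y, w, h):
    """Estimate f on x_grid from draws y of g(y) = w(y) f(y) / mu (w known)."""
    x_grid = np.asarray(x_grid, dtype=float)
    y = np.asarray(y, dtype=float)
    inv_w = 1.0 / w(y)                        # inverse weight of each biased draw
    mu_hat = 1.0 / inv_w.mean()               # estimates mu, since E_g[1/w(Y)] = 1/mu
    u = (x_grid[:, None] - y[None, :]) / h    # scaled distances, shape (grid, n)
    kern = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)  # Gaussian kernel (assumed)
    return mu_hat * (kern * inv_w).mean(axis=1) / h

# Hypothetical example: if f is Gamma(shape=2, scale=1) and w(y) = y,
# the length-biased density g is Gamma(shape=3, scale=1).
rng = np.random.default_rng(0)
y = rng.gamma(shape=3.0, scale=1.0, size=5000)
grid = np.linspace(0.0, 12.0, 600)
f_hat = weighted_kde(grid, y, w=lambda t: t, h=0.2)
```

Ignoring the weights here would estimate g rather than f. The bandwidth 0.2 is arbitrary; in practice it would be chosen by a data-driven rule, and the estimate near zero should be read with care because the weights 1/y grow large there.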


Author information

Corresponding author

Correspondence to José A. Cristóbal.

Additional information

This work has been partially supported by the DGES Grant PB98-1587.

About this article

Cite this article

Cristóbal, J.A., Alcalá, J.T. An overview of nonparametric contributions to the problem of functional estimation from biased data. Test 10, 309–332 (2001). https://doi.org/10.1007/BF02595700

  • DOI: https://doi.org/10.1007/BF02595700
