Abstract
For the random variables $Y, X_1, \ldots, X_p$, where $Y$ is binary, let $M(x_1, \ldots, x_p) = P(Y = 1 \mid (X_1, \ldots, X_p) = (x_1, \ldots, x_p))$. The article compares four smoothers aimed at estimating $M(x_1, \ldots, x_p)$, three of which can be used when $p > 1$. Evidently there are no published comparisons of smoothers when $p > 1$ and $Y$ is binary. One of the estimators stems from Hosmer and Lemeshow (1989, p. 85) and is limited to $p = 1$. A simple modification of this estimator (called method E3 here) is proposed that can be used when $p > 1$. Roughly, a weighted mean of the $Y$ values is used, where the weights are based on a robust analog of Mahalanobis distance that replaces the usual covariance matrix with the minimum volume ellipsoid estimator. Another estimator stems from Signorini and Jones (2004) and is based in part on an estimate of the probability density function of $X_1, \ldots, X_p$. Here, an adaptive kernel density estimator is used. No estimator dominated in terms of mean squared error and bias. For $p = 1$, the differences among three of the estimators, in terms of mean squared error and bias, are not particularly striking. For $p > 1$, however, the differences among the estimators are magnified, with method E3 performing relatively well. An estimator based on the running interval smoother performs about as well as E3, but for general use E3 is found to be preferable. The estimator studied by Signorini and Jones (2004) is not recommended, particularly when $p > 1$.
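The idea behind a distance-weighted mean of the Y values can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the article's method E3: the function name `weighted_success_prob`, the span parameter `h`, and the Gaussian-type weight function are all hypothetical choices, and the classical sample covariance is swapped in as a stand-in for the robust covariance estimate the abstract refers to (a robust estimate can be passed through `cov`).

```python
import numpy as np

def weighted_success_prob(X, y, x0, cov=None, h=1.0):
    """Estimate P(Y = 1 | X = x0) as a distance-weighted mean of the y values.

    Hypothetical sketch: the weight function and the span h are assumptions.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a vector as n observations with p = 1
    y = np.asarray(y, dtype=float)
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))
    if cov is None:
        # Stand-in only: classical covariance, NOT a robust estimator.
        cov = np.atleast_2d(np.cov(X, rowvar=False))
    diff = X - x0
    # Squared Mahalanobis-type distance from each observation to x0.
    d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    # Assumed Gaussian-type weights: closer points count more.
    w = np.exp(-0.5 * d2 / h**2)
    return float(np.sum(w * y) / np.sum(w))

# Usage on simulated data with p = 2 (the data-generating model is made up).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X[:, 0])))
estimate = weighted_success_prob(X, y, x0=[0.5, 0.0])
```

Because the result is a weighted mean of zeros and ones with positive weights, it always lies between 0 and 1, which is the property that makes this kind of smoother a natural fit for a binary Y.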
References
Cleveland, W. S. 1979. Robust locally weighted regression and smoothing scatterplots. J. Am. Stat. Assoc., 74, 829–836.
Efromovich, S. 1999. Nonparametric curve estimation: Methods, theory and applications. New York, Springer-Verlag.
Eubank, R. L. 1999. Nonparametric regression and spline smoothing. New York, Marcel Dekker.
Fan, J. 1993. Local linear smoothers and their minimax efficiencies. Ann. Stat., 21, 196–216.
Fan, J., and I. Gijbels. 1996. Local polynomial modeling and its applications. Boca Raton, FL, CRC Press.
Fox, J. 2001. Multiple and generalized nonparametric regression. Thousand Oaks, CA, Sage.
Green, P. J., and B. W. Silverman. 1993. Nonparametric regression and generalized linear models: A roughness penalty approach. Boca Raton, FL, CRC Press.
Györfi, L., M. Kohler, A. Krzyżak, and H. Walk. 2002. A distribution-free theory of nonparametric regression. New York, Springer-Verlag.
Härdle, W. 1990. Applied nonparametric regression. Econometric Society Monographs No. 19. Cambridge, UK, Cambridge University Press.
Hastie, T. J., and R. J. Tibshirani. 1990. Generalized additive models. New York, Chapman and Hall.
Hoaglin, D. C. 1985. Summarizing shape numerically: The g-and-h distributions. In Exploring data tables, trends, and shapes, ed. D. Hoaglin, F. Mosteller, and J. Tukey, 461–515. New York, Wiley.
Hosmer, D. W., and S. Lemeshow. 1989. Applied logistic regression. New York, Wiley.
Kay, R., and S. Little. 1987. Transformation of the explanatory variables in the logistic regression model for binary data. Biometrika, 74, 495–501.
Rousseeuw, P. J., and A. M. Leroy. 1987. Robust regression & outlier detection. New York, Wiley.
Signorini, D. F., and M. C. Jones. 2004. Kernel estimators for univariate binary regression. J. Am. Stat. Assoc., 99, 119–126.
Silverman, B. W. 1986. Density estimation for statistics and data analysis. New York, Chapman and Hall.
Wilcox, R. R. 2005. Introduction to robust estimation and hypothesis testing, 2nd ed. San Diego, CA, Academic Press.
Cite this article
Wilcox, R.R. Nonparametric Regression When Estimating the Probability of Success: A Comparison of Four Extant Estimators. J Stat Theory Pract 6, 443–451 (2012). https://doi.org/10.1080/15598608.2012.695639
DOI: https://doi.org/10.1080/15598608.2012.695639