For the random variables Y, X1,..., Xp, where Y is binary, let M(x1,..., xp) = P(Y = 1 | (X1,..., Xp) = (x1,..., xp)). The article compares four smoothers aimed at estimating M(x1,..., xp), three of which can be used when p > 1. Apparently there are no published comparisons of smoothers when p > 1 and Y is binary. One of the estimators, which stems from Hosmer and Lemeshow (1989, p. 85), is limited to p = 1. A simple modification of this estimator (called method E3 here) is proposed that can be used when p > 1. Roughly, a weighted mean of the Y values is used, where the weights are based on a robust analog of Mahalanobis distance that replaces the usual covariance matrix with the minimum volume estimator. Another estimator stems from Signorini and Jones (1984) and is based in part on an estimate of the probability density function of X1,..., Xp. Here, an adaptive kernel density estimator is used. No estimator dominated in terms of mean squared error and bias. For p = 1, the differences among three of the estimators, in terms of mean squared error and bias, are not particularly striking, but for p > 1 the differences are magnified, with method E3 performing relatively well. An estimator based on the running interval smoother performs about as well as E3, but for general use E3 is found to be preferable. The estimator studied by Signorini and Jones (1984) is not recommended, particularly when p > 1.
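The weighted-mean idea behind method E3 can be illustrated with a minimal sketch. It is not the article's implementation: the abstract does not state the weight function, so Gaussian weights in the Mahalanobis distance are assumed here, and the function name `weighted_mean_smoother` and the `span` parameter are hypothetical. The scatter matrix defaults to the ordinary sample covariance as a stand-in; to follow the article's description, a robust minimum volume estimate would be passed in through `scatter` instead.

```python
import numpy as np


def weighted_mean_smoother(X, y, x0, scatter=None, span=1.0):
    """Estimate P(Y = 1 | X = x0) as a distance-weighted mean of binary y.

    Illustrative sketch only. `scatter` is the matrix used in the
    Mahalanobis distance; the article describes a robust minimum volume
    estimate, whereas the default here is the classical sample covariance
    (an assumption made to keep the example dependency-free).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a flat vector as n observations with p = 1
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))

    if scatter is None:
        scatter = np.cov(X, rowvar=False)
    scatter = np.atleast_2d(np.asarray(scatter, dtype=float))

    # Squared Mahalanobis distance from each observation to x0.
    diff = X - x0
    d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(scatter), diff)

    # Assumed weight function: Gaussian kernel in the distance.
    w = np.exp(-0.5 * d2 / span**2)

    # Weighted mean of the 0/1 responses, which lies in [0, 1].
    return float(np.sum(w * y) / np.sum(w))


if __name__ == "__main__":
    # Toy data on a 4 x 4 grid: Y = 1 exactly when the first predictor is positive.
    X = [[a, b] for a in (-2, -1, 1, 2) for b in (-2, -1, 1, 2)]
    y = [1 if a > 0 else 0 for a, _ in X]
    print(weighted_mean_smoother(X, y, [2.0, 0.0]))
    print(weighted_mean_smoother(X, y, [-2.0, 0.0]))
```

Smaller values of `span` concentrate the weight on observations close to `x0`, trading lower bias for higher variance. If every weight underflows to zero (a point far from all the data), the ratio is undefined, which a fuller implementation would need to handle.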
AMS Subject Classification
Kernel estimators; Logistic regression; Smoothers
References
Fan, J., and I. Gijbels. 1996. Local polynomial modeling and its applications. Boca Raton, FL, CRC Press.
Fox, J. 2001. Multiple and generalized nonparametric regression. Thousand Oaks, CA, Sage.
Green, P. J., and B. W. Silverman. 1993. Nonparametric regression and generalized linear models: A roughness penalty approach. Boca Raton, FL, CRC Press.
Györfi, L., M. Kohler, A. Krzyżak, and H. Walk. 2002. A distribution-free theory of nonparametric regression. New York, Springer Verlag.
Härdle, W. 1990. Applied nonparametric regression. Econometric Society Monographs No. 19. Cambridge, UK, Cambridge University Press.
Hastie, T. J., and R. J. Tibshirani. 1990. Generalized additive models. New York, Chapman and Hall.
Hoaglin, D. C. 1985. Summarizing shape numerically: The g-and-h distributions. In Exploring data tables, trends, and shapes, ed. D. Hoaglin, F. Mosteller, and J. Tukey, 461–515. New York, Wiley.
Hosmer, D. W., and S. Lemeshow. 1989. Applied logistic regression. New York, Wiley.
Kay, R., and S. Little. 1987. Transformation of the explanatory variables in the logistic regression model for binary data. Biometrika, 74, 495–501.
Rousseeuw, P. J., and A. M. Leroy. 1987. Robust regression & outlier detection. New York, Wiley.