
Localized classification

Published in: Statistics and Computing

Abstract

The main problem with localized discriminant techniques is the curse of dimensionality, which seems to restrict their use to problems with few variables. However, if localization is combined with a reduction of dimension, the initial number of variables is less restricted. In particular, it is shown that localization yields powerful classifiers even in higher dimensions when it is combined with locally adaptive selection of predictors. A robust localized logistic regression (LLR) method is developed for which all tuning parameters are chosen data-adaptively. In an extended simulation study we evaluate the potential of the proposed procedure for various types of data and compare it to other classification procedures. In addition, we demonstrate that automatic choice of the localization, predictor-selection and penalty parameters based on cross-validation works well. Finally, the method is applied to real data sets and its real-world performance is compared to that of alternative procedures.
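The core idea behind a localized classifier of this kind, fitting a penalized logistic regression whose observations are kernel-weighted around each query point, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Gaussian kernel, the gradient-ascent fitting loop, and the bandwidth `h` and ridge weight `lam` are illustrative assumptions, and the locally adaptive predictor selection described in the paper is omitted.

```python
import numpy as np

def localized_logistic_predict(X, y, x0, h=1.0, lam=0.1, n_iter=200, lr=0.1):
    """Predict P(y=1 | x0) with a kernel-weighted, ridge-penalized
    logistic regression fitted locally at the query point x0."""
    # Gaussian kernel weights: training points near x0 dominate the fit
    w = np.exp(-np.sum((X - x0) ** 2, axis=1) / (2 * h ** 2))
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # add intercept column
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ beta))
        # gradient of the weighted log-likelihood ...
        grad = Xb.T @ (w * (y - p))
        # ... minus the ridge penalty gradient (intercept left unpenalized)
        grad[1:] -= lam * beta[1:]
        beta += lr * grad / w.sum()
    return 1.0 / (1.0 + np.exp(-np.append(1.0, x0) @ beta))
```

In the paper the bandwidth-type localization parameter, the set of selected predictors and the ridge penalty are all chosen by cross-validation; in this sketch they are simply fixed arguments.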



Author information

Corresponding author

Correspondence to G. Tutz.


Cite this article

Tutz, G., Binder, H. Localized classification. Stat Comput 15, 155–166 (2005). https://doi.org/10.1007/s11222-005-1305-x

