Statistical Papers

Volume 55, Issue 1, pp 49–69

Fast nonparametric classification based on data depth

Regular Article

Abstract

A new procedure, called the DDα-procedure, is developed to solve the problem of classifying d-dimensional objects into q ≥ 2 classes. The procedure is nonparametric; it uses q-dimensional depth plots and a very efficient algorithm for discrimination analysis in the depth space [0,1]^q. Specifically, the depth is the zonoid depth, and the algorithm is the α-procedure. When there are more than two classes, several binary classifications are performed and a majority rule is applied. Special treatments are discussed for ‘outsiders’, that is, data points whose depth vector is zero. The DDα-classifier is applied to simulated as well as real data, and the results are compared with those of similar procedures that have been recently proposed. In most cases the new procedure has comparable error rates but is much faster than other classification approaches, including the support vector machine.
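The multi-class step described above (several binary classifications combined by a majority rule, with outsiders set aside) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: it assumes the depth vector of a point has already been computed, it assumes the binary classifications are taken over all pairs of classes, and it substitutes a simple maximum-depth comparison for the α-procedure. The names `is_outsider`, `pairwise_max_depth` and `classify` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def is_outsider(depths):
    """True when the point has zero depth with respect to every class."""
    return all(d == 0.0 for d in depths)


def pairwise_max_depth(depths, i, j):
    """Stand-in binary rule: choose the class with the larger depth.

    Assumption: in the paper, a rule learned by the alpha-procedure
    in the depth space would take the place of this comparison.
    """
    return i if depths[i] >= depths[j] else j


def classify(depths, binary_rule=pairwise_max_depth):
    """Majority vote over all q*(q-1)/2 pairwise decisions.

    Returns a class index, or None for an outsider, which needs
    a separate treatment.
    """
    if is_outsider(depths):
        return None
    votes = Counter(
        binary_rule(depths, i, j)
        for i, j in combinations(range(len(depths)), 2)
    )
    # Ties go to the smallest class index (an arbitrary choice made here).
    best = max(votes.values())
    return min(c for c, n in votes.items() if n == best)
```

For example, `classify([0.1, 0.4, 0.2])` collects one vote per pair of classes and returns the class index with the most votes, while `classify([0.0, 0.0, 0.0])` returns `None` to flag an outsider.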

Keywords

Alpha-procedure, Zonoid depth, DD-plot, Pattern recognition, Supervised learning, Misclassification rate, Support vector machine

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Tatjana Lange (1)
  • Karl Mosler (2)
  • Pavlo Mozharovskyi (2)
  1. Hochschule Merseburg, Geusaer Straße, Merseburg, Germany
  2. Universität zu Köln, Albertus-Magnus-Platz, Köln, Germany
