Abstract
Over the last two decades, multivariate sign- and rank-based methods have become popular for analysing multivariate data. In this paper, we propose a classification methodology based on the distribution of multivariate rank functions. The proposed method is fully nonparametric. We first consider a theoretical version of the classifier for K populations and show that it is equivalent to the Bayes rule for spherically symmetric distributions that differ by a location shift. We then present its empirical version and show that its apparent misclassification rate converges asymptotically to the Bayes risk. We also present an affine-invariant version of the classifier and establish its optimality for elliptically symmetric distributions. We illustrate the performance of the classifiers in comparison with some other depth-based classifiers using simulated and real data sets.
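The abstract does not state the classification rule itself, so the following is only a minimal sketch of one plausible reading, not necessarily the authors' exact method. It assumes the multivariate rank is the spatial rank (the average of unit vectors pointing from the sample points to x), and that a point is assigned to the population in which the norm of its rank is least extreme, as measured by the empirical distribution function of the rank norms within that population. The function names `spatial_rank`, `rank_outlyingness` and `classify` are hypothetical, and the sketch is not affine invariant.

```python
import numpy as np


def spatial_rank(x, sample):
    """Empirical spatial rank of x: mean of the unit vectors (x - X_i)/||x - X_i||."""
    diffs = x - sample                       # shape (n, d)
    norms = np.linalg.norm(diffs, axis=1)    # shape (n,)
    keep = norms > 0                         # a point equal to x contributes the zero vector
    units = np.zeros_like(diffs, dtype=float)
    units[keep] = diffs[keep] / norms[keep, None]
    return units.mean(axis=0)


def rank_outlyingness(x, sample):
    """Empirical distribution function of the rank norms, evaluated at ||rank(x)||.

    ASSUMPTION: this score is an illustrative choice, not taken from the source.
    Values near 0 mean x is central in the sample; values near 1 mean x is outlying.
    """
    r_x = np.linalg.norm(spatial_rank(x, sample))
    r_sample = np.array([np.linalg.norm(spatial_rank(s, sample)) for s in sample])
    return float(np.mean(r_sample <= r_x))


def classify(x, samples):
    """Assign x to the index of the training sample in which it is least outlying."""
    scores = [rank_outlyingness(x, s) for s in samples]
    return int(np.argmin(scores))


if __name__ == "__main__":
    # Two simulated spherical populations that differ by a location shift.
    rng = np.random.default_rng(0)
    class_a = rng.normal(loc=0.0, scale=1.0, size=(50, 2))
    class_b = rng.normal(loc=10.0, scale=1.0, size=(50, 2))
    print(classify(np.array([0.1, -0.2]), [class_a, class_b]))
```

Computing the rank of every training point costs quadratic time in the sample size, which is acceptable for a sketch but would need caching for repeated queries.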
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Makinde, O.S., Chakraborty, B. (2015). On Some Nonparametric Classifiers Based on Distribution Functions of Multivariate Ranks. In: Nordhausen, K., Taskinen, S. (eds) Modern Nonparametric, Robust and Multivariate Methods. Springer, Cham. https://doi.org/10.1007/978-3-319-22404-6_15
DOI: https://doi.org/10.1007/978-3-319-22404-6_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22403-9
Online ISBN: 978-3-319-22404-6
eBook Packages: Mathematics and Statistics (R0)