Behavior Research Methods, Volume 50, Issue 1, pp 416–426

Classifiers as a model-free group comparison test



Conventional statistical methods for detecting group differences assume a correctly specified model, including the origin of the difference. Researchers must therefore be able to identify the source of a group difference and choose a corresponding method. In this paper, we propose a new approach to group comparison that requires no model specification and instead uses classification algorithms from machine learning. In this approach, classification accuracy is obtained using Independent Validation and evaluated against a binomial distribution. As an application example, we examined the false-positive error rates and statistical power of support vector machines (SVMs) in detecting group differences, in comparison with conventional statistical tests such as the t test, Levene’s test, the Kolmogorov–Smirnov (K-S) test, Fisher’s z-transformation, and MANOVA. The SVMs detected group differences regardless of their origin (mean, variance, distribution shape, or covariance) and showed comparably consistent power across conditions. When a group difference originated from a single source, the statistical power of SVMs was lower than that of the most appropriate conventional test for the study condition; however, the power of SVMs increased when differences originated from multiple sources. Moreover, SVMs performed substantially better with more variables than with fewer. Most importantly, SVMs were applicable to any type of data without sophisticated model specification. This study demonstrates a new application of classification algorithms as an alternative or complement to conventional group comparison tests. With the proposed approach, researchers can test two-sample data even when they are not certain which statistical test to use or when the data violate the statistical assumptions of conventional methods.
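The core idea of testing classification accuracy against a binomial distribution can be illustrated with a minimal sketch. It is not the authors' procedure. To stay within the Python standard library, it swaps in a nearest-centroid classifier for the SVM and a single train/test split for Independent Validation, which this abstract does not describe in detail. The names `binomial_tail_p`, `centroid`, `sq_dist`, and `holdout_group_test` are hypothetical, and the chance level of 0.5 assumes two equally sized groups.

```python
import math
import random


def binomial_tail_p(n_correct, n_test, chance=0.5):
    """One-sided p-value: P(X >= n_correct) for X ~ Binomial(n_test, chance)."""
    return sum(
        math.comb(n_test, k) * chance**k * (1 - chance) ** (n_test - k)
        for k in range(n_correct, n_test + 1)
    )


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def holdout_group_test(group_a, group_b, train_frac=0.5, seed=0):
    """Train on one part of each group, test on the rest, and compare
    the number of correct predictions with chance (assumed 0.5)."""
    rng = random.Random(seed)
    a, b = group_a[:], group_b[:]
    rng.shuffle(a)
    rng.shuffle(b)
    cut_a, cut_b = int(len(a) * train_frac), int(len(b) * train_frac)

    # Stand-in classifier (assumption): assign each point to the nearer centroid.
    centre_a, centre_b = centroid(a[:cut_a]), centroid(b[:cut_b])

    # Test points were never used for training, so each outcome is a
    # separate success/failure trial.
    test = [(x, 0) for x in a[cut_a:]] + [(x, 1) for x in b[cut_b:]]
    n_correct = sum(
        (0 if sq_dist(x, centre_a) <= sq_dist(x, centre_b) else 1) == label
        for x, label in test
    )
    return n_correct, len(test), binomial_tail_p(n_correct, len(test))


# Simulated two-variable data in which the groups differ in their means.
rng = random.Random(1)
group_a = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(100)]
group_b = [[rng.gauss(1.0, 1.0), rng.gauss(1.0, 1.0)] for _ in range(100)]
n_correct, n_test, p_value = holdout_group_test(group_a, group_b)
print(n_correct, n_test, p_value)
```

A small p-value indicates that the classifier separates the groups better than chance, which is taken as evidence of a group difference. A nearest-centroid rule responds only to mean differences, so detecting differences in variance, distribution shape, or covariance would require a more flexible classifier such as the SVM examined in the paper.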


Keywords: Group comparison; Classifiers; Support vector machine; K-fold cross validation; Independent validation



Copyright information

© Psychonomic Society, Inc. 2017

Authors and Affiliations

  1. Federal Reserve Bank of Kansas City, Kansas City, USA
  2. University of the Federal Defense Forces, Neubiberg, Germany
