Machine Vision and Applications, Volume 24, Issue 6, pp 1311–1325

Mind reading with regularized multinomial logistic regression

  • Heikki Huttunen
  • Tapio Manninen
  • Jukka-Pekka Kauppi
  • Jussi Tohka
Original Paper

Abstract

In this paper, we consider the problem of multinomial classification of magnetoencephalography (MEG) data. The proposed method participated in the MEG mind reading competition of the ICANN’11 conference, where the goal was to train a classifier for predicting the movie the test person was shown. Our approach was the best among ten submissions, reaching 68 % classification accuracy in this five-category problem. The method is based on a regularized logistic regression model, whose efficient feature selection is critical for cases with more measurements than samples. Moreover, special attention is paid to the estimation of the generalization error in order to avoid overfitting to the training data. Here, in addition to describing our competition entry in detail, we report selected additional experiments, which question the usefulness of complex feature extraction procedures and the basic frequency decomposition of the MEG signal for this application.
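To make the idea of the abstract concrete, the following is a minimal sketch of multinomial logistic regression with an elastic net penalty, applied to a case with more features than samples and evaluated on held-out data. It is not the competition entry: the data are synthetic, the optimizer is plain proximal gradient descent chosen for brevity, and all function names, parameter values (lam, alpha, n_iter) and data dimensions are assumptions made for illustration.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax, shifted for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def soft_threshold(w, t):
    """Proximal operator of t * ||w||_1; sets small weights exactly to zero."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)


def fit_elastic_net_mlr(X, y, n_classes, lam=0.2, alpha=0.8, n_iter=300):
    """Fit multinomial logistic regression with the penalty
    lam * (alpha * ||W||_1 + 0.5 * (1 - alpha) * ||W||_2^2)
    by proximal gradient descent. The intercepts b are not penalized."""
    n, d = X.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot targets
    # Step size from an upper bound on the curvature of the smooth part.
    A = np.hstack([X, np.ones((n, 1))])
    lipschitz = 0.5 * np.linalg.norm(A, 2) ** 2 / n + lam * (1.0 - alpha)
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        R = softmax(X @ W + b) - Y  # residuals of the class probabilities
        grad_W = X.T @ R / n + lam * (1.0 - alpha) * W  # log-loss + ridge part
        b = b - step * R.mean(axis=0)
        # Gradient step on the smooth part, then shrinkage for the L1 part.
        W = soft_threshold(W - step * grad_W, step * lam * alpha)
    return W, b


def predict(X, W, b):
    """Return the most probable class for each row of X."""
    return np.argmax(X @ W + b, axis=1)


def make_data(n, d=200, k=5, signal=3.0, seed=0):
    """Hypothetical toy data: k balanced classes, d features, and only the
    first k features carry class information (one feature per class)."""
    rng = np.random.default_rng(seed)
    centers = np.zeros((k, d))
    centers[np.arange(k), np.arange(k)] = signal
    y = np.arange(n) % k
    X = centers[y] + rng.normal(size=(n, d))
    return X, y


# 60 training samples against 200 features: more measurements than samples.
X_train, y_train = make_data(60, seed=0)
X_test, y_test = make_data(200, seed=1)
W, b = fit_elastic_net_mlr(X_train, y_train, n_classes=5)

# Held-out accuracy serves as an estimate of the generalization error.
print("held-out accuracy:", np.mean(predict(X_test, W, b) == y_test))
print("fraction of zero weights:", np.mean(W == 0))
```

The L1 part of the penalty is what performs the feature selection: weights of uninformative features are shrunk exactly to zero. Accuracy is measured on samples that were not used for fitting, since the training accuracy is an optimistic estimate when features outnumber samples.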

Keywords

Logistic regression · Elastic net regularization · Classification · Decoding · Magnetoencephalography · Natural stimulus

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Heikki Huttunen (1)
  • Tapio Manninen (1)
  • Jukka-Pekka Kauppi (2)
  • Jussi Tohka (1)
  1. Department of Signal Processing, Tampere University of Technology, Tampere, Finland
  2. Department of Computer Science and HIIT, University of Helsinki, Helsinki, Finland
