Comparison of Feature Selection Techniques in Machine Learning for Anatomical Brain MRI in Dementia

  • Original Article
  • Neuroinformatics

Abstract

We present a comparative split-half resampling analysis of data-driven feature selection and classification methods for whole-brain, voxel-based classification of anatomical magnetic resonance images. We compared support vector machines (SVMs), with or without filter-based feature selection, several embedded feature selection methods, and stability selection. While comparisons of the accuracy of various classification methods have been reported previously, the variability of the out-of-training-sample classification accuracy and of the set of selected features due to independent training and test sets has not previously been addressed in a brain imaging context. We studied two classification problems: 1) Alzheimer’s disease (AD) vs. normal control (NC) and 2) mild cognitive impairment (MCI) vs. NC classification. In AD vs. NC classification, the variability in the test accuracy due to the subject sample did not vary between different methods and exceeded the variability due to different classifiers. In MCI vs. NC classification, particularly with a large training set, embedded feature selection methods outperformed SVM-based ones, with the difference in the test accuracy exceeding the test accuracy variability due to the subject sample. The filter and embedded methods produced divergent feature patterns for MCI vs. NC classification, which, together with the good generalization performance, suggests the utility of embedded feature selection for this problem. The stability of the feature sets was strongly correlated with the number of features selected, weakly correlated with the stability of classification accuracy, and uncorrelated with the average classification accuracy.
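To make the split-half resampling idea concrete, the following is a minimal sketch on synthetic data, not the study's pipeline. It assumes a simple t-statistic filter, a nearest-centroid classifier as a stand-in for the SVM, and the Dice coefficient as the feature-set overlap measure; all function names and parameter values (`make_data`, `split_half`, `k=5`, and so on) are hypothetical and chosen only for illustration.

```python
import random
import statistics


def dice(a, b):
    """Dice coefficient between two collections of selected feature indices."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


def make_data(n_per_class, n_features, n_informative, shift, rng):
    """Synthetic two-class data; only the first n_informative features differ."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(shift * label if j < n_informative else 0.0, 1.0)
                      for j in range(n_features)])
            y.append(label)
    return X, y


def filter_scores(X, y):
    """Absolute Welch t-statistic per feature, used as a filter criterion."""
    scores = []
    for j in range(len(X[0])):
        g0 = [x[j] for x, lab in zip(X, y) if lab == 0]
        g1 = [x[j] for x, lab in zip(X, y) if lab == 1]
        se = (statistics.variance(g0) / len(g0)
              + statistics.variance(g1) / len(g1)) ** 0.5
        scores.append(abs(statistics.mean(g1) - statistics.mean(g0)) / se)
    return scores


def fit(X, y, k):
    """Keep the k top-scoring features, then store per-class centroids."""
    scores = filter_scores(X, y)
    feats = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]
    cents = {}
    for c in (0, 1):
        rows = [x for x, lab in zip(X, y) if lab == c]
        cents[c] = [statistics.mean(r[j] for r in rows) for j in feats]
    return feats, cents


def accuracy(model, X, y):
    """Fraction of samples assigned to the correct nearest centroid."""
    feats, cents = model
    hits = 0
    for x, lab in zip(X, y):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(feats, cents[c]))
                for c in cents}
        hits += min(dist, key=dist.get) == lab
    return hits / len(y)


def split_half(X, y, k, n_rep, rng):
    """Stratified split-half resampling: train on one half, test on the other."""
    accs, overlaps = [], []
    by_class = {c: [i for i, lab in enumerate(y) if lab == c] for c in (0, 1)}
    for _ in range(n_rep):
        half_a, half_b = [], []
        for idx in by_class.values():
            idx = idx[:]
            rng.shuffle(idx)
            half_a += idx[:len(idx) // 2]
            half_b += idx[len(idx) // 2:]
        models = []
        for train, test in ((half_a, half_b), (half_b, half_a)):
            model = fit([X[i] for i in train], [y[i] for i in train], k)
            accs.append(accuracy(model, [X[i] for i in test], [y[i] for i in test]))
            models.append(model)
        # Feature-set stability: overlap of the sets chosen on the two halves.
        overlaps.append(dice(models[0][0], models[1][0]))
    return accs, overlaps


rng = random.Random(0)
X, y = make_data(n_per_class=40, n_features=50, n_informative=5, shift=1.5, rng=rng)
accs, overlaps = split_half(X, y, k=5, n_rep=20, rng=rng)
print("mean test accuracy:", statistics.mean(accs))
print("test accuracy std across halves:", statistics.stdev(accs))
print("mean Dice overlap of feature sets:", statistics.mean(overlaps))
```

Because the two halves share no subjects, the spread of `accs` reflects variability due to the subject sample, and `overlaps` gives a per-split measure of how reproducible the selected feature set is. Swapping in another selection method only requires replacing `fit`.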

Notes

  1. This is akin to the implementation in the Donders Machine Learning Toolbox (https://github.com/distrep/DMLT).

  2. Briefly, as the LASSO does not enforce grouping, it is sometimes considered inappropriate for neuroimaging applications (Carroll et al. 2009). The performance of EN-VA was very similar to that of EN-05 in the AD vs. NC problem. For these reasons, we decided not to perform the experiments for these methods for the MCI vs. NC problem.
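The grouping remark in note 2 can be illustrated with a small sketch. It assumes a squared-error loss (rather than the classification losses used in the study), two identical copies of one feature, and a plain coordinate-descent solver written here for illustration; the function names and the values `lam=0.5` and `alpha=0.5` are hypothetical choices, not settings taken from the article.

```python
def soft_threshold(value, threshold):
    """Soft-thresholding operator used by L1-penalized coordinate descent."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_cd(X, y, lam, alpha, n_sweeps=200):
    """Minimize (1/2n)||y - Xw||^2 + lam*(alpha*||w||_1 + (1-alpha)/2*||w||^2).

    alpha=1 gives the LASSO; 0 < alpha < 1 gives the elastic net.
    No intercept is fitted, so X and y are assumed to be centered.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * w[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, lam * alpha) / (z + lam * (1.0 - alpha))
    return w


# Two perfectly correlated (duplicated) features, response y = 2 * x.
x = [-1.0, 1.0] * 10
X = [[v, v] for v in x]
y = [2.0 * v for v in x]

w_lasso = elastic_net_cd(X, y, lam=0.5, alpha=1.0)
w_enet = elastic_net_cd(X, y, lam=0.5, alpha=0.5)
print("LASSO weights:      ", w_lasso)
print("elastic net weights:", w_enet)
```

In this toy case the LASSO solution places all weight on whichever copy is updated first and leaves the other at zero, whereas the elastic net's ridge term makes the solution unique and splits the weight equally between the two copies. With spatially correlated voxels, that is the behavior usually meant by "grouping".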

References

  • Ashburner, J., & Friston, K. (2005). Unified segmentation. Neuroimage, 26(3), 839–851.

  • Baldassarre, L., Mourao-Miranda, J., & Pontil, M. (2012). Structured sparsity models for brain decoding from fmri data. In Pattern Recognition in NeuroImaging (PRNI), 2012 International Workshop on (pp. 5–8): IEEE.

  • Bouckaert, R.R., & Frank, E. (2004). Evaluating the replicability of significance tests for comparing learning algorithms. In Advances in knowledge discovery and data mining (pp. 3–12): Springer.

  • Bron, E.E., Smits, M., van der Flier, W.M., Vrenken, H., Barkhof, F., Scheltens, P., Papma, J.M., Steketee, R.M., Orellana, C.M., Meijboom, R., & et al. (2015). Standardized evaluation of algorithms for computer-aided diagnosis of dementia based on structural mri: The caddementia challenge. NeuroImage, 111, 562–579.

  • Carroll, M.K., Cecchi, G.A., Rish, I., Garg, R., & Rao, A.R. (2009). Prediction and interpretation of distributed neural activity with sparse models. NeuroImage, 44(1), 112–122.

  • Casanova, R., Whitlow, C.T., Wagner, B., Williamson, J., Shumaker, S.A., Maldjian, J.A., & Espeland, M.A. (2011b). High dimensional classification of structural mri alzheimer’s disease data based on large scale regularization. Frontiers in Neuroinformatics, 5.

  • Chang, C.C., & Lin, C.J. (2011). Libsvm: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST), 2(3), 27.

  • Chu, C., Hsu, A.L., Chou, K.H., Bandettini, P., Lin, C., Initiative, A.D.N., & et al. (2012). Does feature selection improve classification accuracy? impact of sample size and feature selection on classification using anatomical magnetic resonance images. Neuroimage, 60(1), 59–70.

  • Cuadra, M.B., Cammoun, L., Butz, T., Cuisenaire, O., & Thiran, J.P. (2005). Comparison and validation of tissue modelization and statistical classification methods in t1-weighted mr brain images. IEEE Transactions on Medical Imaging, 24(12), 1548–1565.

  • Cuingnet, R., Gerardin, E., Tessieras, J., Auzias, G., Lehéricy, S., Habert, M.O., Chupin, M., Benali, H., & Colliot, O. (2011). Automatic classification of patients with alzheimer’s disease from structural mri: a comparison of ten methods using the adni database. Neuroimage, 56(2), 766–781.

  • Cuingnet, R., Glaunès, J.A., Chupin, M., Benali, H., & Colliot, O. (2013). Spatial and anatomical regularization of svm: a general framework for neuroimaging data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(3), 682–696.

  • Dalton, L.A., & Dougherty, E.R. (2011). Bayesian minimum mean-square error estimation for classification error—part II: The Bayesian MMSE error estimator for linear classification of Gaussian distributions. IEEE Trans Signal Process, 59(1), 130–144.

  • Davis, T., LaRocque, K.F., Mumford, J.A., Norman, K.A., Wagner, A.D., & Poldrack, R.A. (2014). What do differences between multi-voxel and univariate analysis mean? how subject-, voxel-, and trial-level variance impact fmri analysis. NeuroImage, 97, 271–283.

  • Dice, L.R. (1945). Measures of the amount of ecologic association between species. Ecology, 26(3), 297–302.

  • Dietterich, T.G. (1998). Approximate statistical tests for comparing supervised classification learning algorithms. Neural computation, 10(7), 1895–1923.

  • Dougherty, E.R., Sima, C., Hanczar, B., & Braga-Neto, U.M. (2010). Performance of error estimators for classification. Current Bioinformatics, 5(1), 53.

  • Dubuisson, M.P., & Jain, A.K. (1994). A modified hausdorff distance for object matching. In Pattern Recognition, 1994. Vol. 1-Conference A: Computer Vision & Image Processing., Proceedings of the 12th IAPR International Conference on, (Vol. 1 pp. 566–568): IEEE.

  • Dukart, J., Schroeter, M.L., & Mueller, K. (2011). Age correction in dementia–matching to a healthy brain. PloS one, 6(7), e22193.

  • Fiot, J.B., Raguet, H., Risser, L., Cohen, L.D., Fripp, J., & Vialard, F.X. (2014). Longitudinal deformation models, spatial regularizations and learning strategies to quantify alzheimer’s disease progression. NeuroImage: Clinical, 4, 718–729.

  • Fjell, A.M., McEvoy, L., Holland, D., Dale, A.M., Walhovd, K.B., & et al. (2013). Brain changes in older adults at very low risk for alzheimer’s disease. The Journal of Neuroscience, 33(19), 8237–8242.

  • Franke, K., Ziegler, G., Klöppel, S., & Gaser, C. (2010). Estimating the age of healthy subjects from t1-weighted mri scans using kernel methods: Exploring the influence of various parameters. Neuroimage, 50(3), 883–892.

  • Franke, K., Ristow, M., Gaser, C., Initiative, A.D.N., & et al. (2014). Gender-specific impact of personal health parameters on individual brain aging in cognitively unimpaired elderly subjects. Frontiers in Aging Neuroscience, 6(94).

  • Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1–22.

  • Gaser, C. (2009). Partial volume segmentation with adaptive maximum a posteriori (map) approach. NeuroImage, 47, S121.

  • Gaser, C., Franke, K., Klöppel, S., Koutsouleris, N., Sauer, H., & Initiative, A.D.N. (2013). Brainage in mild cognitive impaired patients: Predicting the conversion to alzheimer’s disease. PloS one, 8(6), e67346.

  • Genovese, C.R., Lazar, N.A., & Nichols, T. (2002). Thresholding of statistical maps in functional neuroimaging using the false discovery rate. Neuroimage, 15(4), 870–878.

  • Glick, N. (1978). Additive estimators for probabilities of correct classification. Pattern Recognition, 10(3), 211–222.

  • Grosenick, L., Greer, S., & Knutson, B. (2008). Interpretable classifiers for fmri improve prediction of purchases. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 16(6), 539–548.

  • Grosenick, L., Klingenberg, B., Katovich, K.B.K., & Taylor, J.E. (2013). Interpretable whole-brain prediction analysis with graphnet. NeuroImage, 72, 304–321.

  • Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. The Journal of Machine Learning Research, 3, 1157–1182.

  • Hastie, T., Rosset, S., Tibshirani, R., & Zhu, J. (2004). The entire regularization path for the support vector machine. The Journal of Machine Learning Research, 5, 1391–1415.

  • Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning (2nd ed.). Springer Series in Statistics.

  • Haufe, S., Meinecke, F., Görgen, K., Dähne, S., Haynes, J.D., Blankertz, B., & Bießmann, F. (2014). On the interpretation of weight vectors of linear models in multivariate neuroimaging. Neuroimage, 87, 96–110.

  • Huttunen, H., & Tohka, J. (2015). Model selection for linear classifiers using bayesian error estimation. Pattern Recognition, 48, 3739–3748.

  • Huttunen, H., Manninen, T., & Tohka, J. (2012). Mind reading with multinomial logistic regression: Strategies for feature selection, (pp. 42–49). Helsinki, Finland: Federated Computer Science Event.

  • Huttunen, H., Manninen, T., Kauppi, J.P., & Tohka, J. (2013a). Mind reading with regularized multinomial logistic regression. Machine Vision and Applications, 24(6), 1311–1325.

  • Huttunen, H., Manninen, T., & Tohka, J. (2013b). Bayesian error estimation and model selection in sparse logistic regression. In 2013 IEEE International Workshop on Machine Learning for Signal Processing (MLSP) (pp. 1–6): IEEE.

  • Inza, I., Larrañaga, P., Blanco, R., & Cerrolaza, A.J. (2004). Filter versus wrapper gene selection approaches in dna microarray domains. Artificial Intelligence in Medicine, 31(2), 91–103.

  • Jimura, K., & Poldrack, R.A. (2012). Analyses of regional-average activation and multivoxel pattern information tell complementary stories. Neuropsychologia, 50(4), 544–552.

  • Kenny, D. (1987). Statistics for the Social and Behavioral Sciences: Little Brown.

  • Kerr, W.T., Douglas, P.K., Anderson, A., & Cohen, M.S. (2014). The utility of data-driven feature selection: Re: Chu et al. 2012. NeuroImage, 84, 1107–1110.

  • Khundrakpam, B.S., Tohka, J., & Evans, A.C. (2015). Prediction of brain maturity based on cortical thickness at different spatial resolutions. NeuroImage, 111, 350–359.

  • Klöppel, S., Peter, J., Ludl, A., Pilatus, A., Maier, S., Mader, I., Heimbach, B., Frings, L., Egger, K., Dukart, J., & et al. (2015). Applying automated mr-based diagnostic methods to the memory clinic: A prospective study. Journal of Alzheimer’s Disease, 47, 939–954.

  • Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. In International Joint Conference on Artificial Intelligence (IJCAI95), (Vol. 14 pp. 1137–1145).

  • Lazar, N.A., Luna, B., Sweeney, J.A., & Eddy, W.F. (2002). Combining brains: a survey of methods for statistical pooling of information. Neuroimage, 16(2), 538–550.

  • Meinshausen, N., & Bühlmann, P. (2010). Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(4), 417–473.

  • Michel, V., Gramfort, A., Varoquaux, G., Eger, E., & Thirion, B. (2011). Total variation regularization for fmri-based prediction of behavior. IEEE Transactions on Medical Imaging, 30(7), 1328–1340.

  • Mohr, H., Wolfensteller, U., Frimmel, S., & Ruge, H. (2015). Sparse regularization techniques provide novel insights into outcome integration processes. NeuroImage, 104, 163–176.

  • Moradi, E., Gaser, C., & Tohka, J. (2014). Semi-supervised learning in mci-to-ad conversion prediction - when is unlabeled data useful? IEEE Pattern Recognition in Neuro Imaging, 121–124.

  • Moradi, E., Pepe, A., Gaser, C., Huttunen, H., & Tohka, J. (2015). Machine learning framework for early mri-based alzheimer’s conversion prediction in mci subjects. NeuroImage, 104, 398–412.

  • Mwangi, B., Tian, T.S., & Soares, J.C. (2014). A review of feature reduction techniques in neuroimaging. Neuroinformatics, 12(2), 229–244.

  • Nadeau, C., & Bengio, Y. (2003). Inference for the generalization error. Machine Learning, 52(3), 239–281.

  • Pajula, J., Kauppi, J.P., & Tohka, J. (2012). Inter-subject correlation in fmri: method validation against stimulus-model based analysis. PloS one, 7(8), e41196.

  • Petersen, R., Aisen, P., Beckett, L., Donohue, M., Gamst, A., Harvey, D., Jack, C., Jagust, W., Shaw, L., Toga, A., & et al. (2010). Alzheimer’s disease neuroimaging initiative (adni) clinical characterization. Neurology, 74(3), 201–209.

  • Rajapakse, J.C., Giedd, J.N., & Rapoport, J.L. (1997). Statistical approach to segmentation of single-channel cerebral mr images. IEEE Transactions on Medical Imaging, 16(2), 176–186.

  • Rasmussen, P.M., Hansen, L.K., Madsen, K.H., Churchill, N.W., & Strother, S.C. (2012). Model sparsity and brain pattern interpretation of classification models in neuroimaging. Pattern Recognition, 45(6), 2085–2100.

  • Retico, A., Bosco, P., Cerello, P., Fiorina, E., Chincarini, A., & Fantacci, M.E. (2015). Predictive models based on support vector machines: Whole-brain versus regional analysis of structural mri in the alzheimer’s disease. Journal of Neuroimaging (in press).

  • Rondina, J.M., Hahn, T., De Oliveira, L., Marquand, A.F., Dresler, T., Leitner, T., Fallgatter, A.J., Shawe-Taylor, J., & Mourao-Miranda, J. (2014). Scors–a method based on stability for feature selection and mapping in neuroimaging. IEEE Transactions on Medical Imaging, 33(1), 85–98.

  • Ryali, S., Supekar, K., Abrams, D.A., & Menon, V. (2010). Sparse logistic regression for whole-brain classification of fmri data. NeuroImage, 51(2), 752–764.

  • Sabuncu, M.R., Konukoglu, E., Initiative, A.D.N., & et al. (2015). Clinical prediction from structural brain mri scans: A large-scale empirical study. Neuroinformatics, 13, 31–46.

  • Strother, S.C., Anderson, J., Hansen, L.K., Kjems, U., Kustra, R., Sidtis, J., Frutiger, S., Muley, S., LaConte, S., & Rottenberg, D. (2002). The quantitative evaluation of functional neuroimaging experiments: the npairs data analysis framework. NeuroImage, 15(4), 747–771.

  • Tibshirani, R. (1996). Regression shrinkage and selection via the LASSO. Journal of the Royal Statistical Society Series B, 58, 267–288.

  • Tohka, J., Zijdenbos, A., & Evans, A. (2004). Fast and robust parameter estimation for statistical partial volume models in brain mri. Neuroimage, 23(1), 84–97.

  • Van Gerven, M.A., Cseke, B., De Lange, F.P., & Heskes, T. (2010). Efficient bayesian multivariate fmri analysis using a sparsifying spatio-temporal prior. NeuroImage, 50(1), 150–161.

  • Weiner, M., Veitch, D.P., Aisen, P.S., Beckett, L.A., Cairns, N.J., & et al. (2012). The alzheimer’s disease neuroimaging initiative: A review of papers published since its inception. Alzheimer’s & Dementia, 8(1), S1–S68.

  • Ye, J., Farnum, M., Yang, E., Verbeeck, R., Lobanov, V., Raghavan, N., Novak, G., Dibernardo, A., & Narayan, V. (2012). Sparse learning and stability selection for predicting mci to ad conversion using baseline adni data. BMC Neurology, 12(46), 1–12.

  • Zijdenbos, A.P., Dawant, B.M., Margolin, R.A., & Palmer, A.C. (1994). Morphometric analysis of white matter lesions in mr images: method and validation. IEEE Transactions on Medical Imaging, 13(4), 716–724.

  • Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320.

Acknowledgments

Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie; Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

This project has received funding from the Universidad Carlos III de Madrid, the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 600371, the Ministerio de Economía y Competitividad (COFUND2013-40258), and Banco Santander.

We also acknowledge CSC – IT Center for Science Ltd., Finland, for the allocation of computational resources.

Author information

Corresponding author

Correspondence to Jussi Tohka.

Ethics declarations

Conflict of interests

No conflicts of interest exist for any of the named authors in this study.

Additional information

Alzheimer’s Disease Neuroimaging Initiative (ADNI) is a Group/Institutional Author

Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf

Electronic supplementary material

Below is the link to the electronic supplementary material.

(PDF 167 KB)

About this article

Cite this article

Tohka, J., Moradi, E., Huttunen, H. et al. Comparison of Feature Selection Techniques in Machine Learning for Anatomical Brain MRI in Dementia. Neuroinform 14, 279–296 (2016). https://doi.org/10.1007/s12021-015-9292-3

  • DOI: https://doi.org/10.1007/s12021-015-9292-3
