Finding Effective Ways to (Machine) Learn fMRI-Based Classifiers from Multi-site Data

Vega, Roberto; Greiner, Russ

doi:10.1007/978-3-030-02628-8_4


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11038)


Abstract

Machine learning techniques often require many training instances to find useful patterns, especially when the signal is subtle in high-dimensional data. This is particularly true when seeking classifiers of psychiatric disorders from fMRI (functional magnetic resonance imaging) data. Given the relatively small number of instances available at any single site, many projects try to use data from multiple sites. However, forming a dataset by simply concatenating the data from the various sites often fails due to batch effects: the accuracy of a classifier learned from such a multi-site dataset is often worse than that of a classifier learned from a single site. We show why several simple, commonly used techniques (such as including the site as a covariate, z-score normalization, or whitening) are useful only in very restrictive cases. Additionally, we propose an evaluation methodology to measure the impact of batch effects in classification studies, and a technique for solving batch effects under the assumption that they are caused by a linear transformation. We empirically show that this approach consistently improves the performance of classifiers in multi-site scenarios and is more stable than the other approaches analyzed.
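As an illustration of one of the simple baselines named in the abstract, the following is a minimal sketch of per-site z-score normalization on synthetic data. It is not the technique proposed in the paper. The data, the function name `zscore_per_site`, and the choice of a per-feature gain and offset as the batch effect are all assumptions made for this example; that choice is one restrictive setting in which per-site z-scoring would be expected to align the sites.

```python
import numpy as np

def zscore_per_site(X, site):
    """Standardize each feature separately within each site (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    site = np.asarray(site)
    out = np.empty_like(X)
    for s in np.unique(site):
        rows = site == s
        mu = X[rows].mean(axis=0)
        sd = X[rows].std(axis=0)
        sd[sd == 0] = 1.0  # leave constant features unscaled
        out[rows] = (X[rows] - mu) / sd
    return out

# Synthetic data: both sites share one underlying distribution, but site B
# is observed through an assumed per-feature gain and offset.
rng = np.random.default_rng(0)
n, d = 200, 5
X_a = rng.normal(size=(n, d))
gain = rng.uniform(0.5, 2.0, size=d)      # assumed site-B gain
offset = rng.normal(scale=3.0, size=d)    # assumed site-B offset
X_b = rng.normal(size=(n, d)) * gain + offset

X = np.vstack([X_a, X_b])
site = np.array(["A"] * n + ["B"] * n)
Z = zscore_per_site(X, site)

# Largest per-feature difference between the two site means, before and after.
gap_before = np.abs(X_a.mean(axis=0) - X_b.mean(axis=0)).max()
gap_after = np.abs(Z[site == "A"].mean(axis=0) - Z[site == "B"].mean(axis=0)).max()
print(f"max mean gap before: {gap_before:.3f}, after: {gap_after:.2e}")
```

Per-site standardization also removes any genuine between-site difference in feature means, for example one caused by different proportions of patients and controls at each site, so this sketch should not be read as a general remedy.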

Supported by the Mexican National Council of Science and Technology (CONACYT), Canada’s Natural Science and Engineering Research Council (NSERC) and the Alberta Machine Intelligence Institute (AMII).



Author information

Authors and Affiliations

University of Alberta, Edmonton, AB, T6G 2R3, Canada
Roberto Vega & Russ Greiner


Corresponding author

Correspondence to Roberto Vega.

Editor information

Editors and Affiliations

University College London, London, UK
Danail Stoyanov
University of Leeds, Leeds, UK
Zeike Taylor
Radboud University Medical Center, Nijmegen, The Netherlands
Seyed Mostafa Kia
Vanderbilt University, Nashville, TN, USA
Ipek Oguz
University of Bern, Bern, Switzerland
Mauricio Reyes
Sunnybrook Research Institute, Toronto, ON, Canada
Anne Martel
German Cancer Research Center (DKFZ), Heidelberg, Germany
Lena Maier-Hein
Radboud University Nijmegen Medical Center, Nijmegen, The Netherlands
Andre F. Marquand
NeuroSpin, CEA Saclay, Gif-sur-Yvette, France
Edouard Duchesnay
Umeå University, Umeå, Sweden
Tommy Löfstedt
Vanderbilt University, Nashville, TN, USA
Bennett Landman
King's College London, London, UK
M. Jorge Cardoso
University of Minho, Guimarães, Portugal
Carlos A. Silva
University of Minho, Guimarães, Portugal
Sergio Pereira
University Hospital Inselspital, Bern, Switzerland
Raphael Meier


About this paper

Cite this paper

Vega, R., Greiner, R. (2018). Finding Effective Ways to (Machine) Learn fMRI-Based Classifiers from Multi-site Data. In: Stoyanov, D., et al. Understanding and Interpreting Machine Learning in Medical Image Computing Applications. MLCN 2018, DLF 2018, IMIMIC 2018. Lecture Notes in Computer Science, vol 11038. Springer, Cham. https://doi.org/10.1007/978-3-030-02628-8_4


DOI: https://doi.org/10.1007/978-3-030-02628-8_4
Published: 24 October 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02627-1
Online ISBN: 978-3-030-02628-8
eBook Packages: Computer Science, Computer Science (R0)
