Abstract
The maturation of in vivo neuroimaging has led to incredible quantities of digital information about the human brain. While much is made of the data deluge in science, neuroimaging represents the leading edge of this onslaught of “big data”. A range of neuroimaging databasing approaches has streamlined the transmission, storage, and dissemination of data from such brain imaging studies. Yet few, if any, common solutions exist to support the science of neuroimaging. In this article, we discuss how modern neuroimaging research represents a multifactorial and broad-ranging data challenge, involving the growing size of the data being acquired; sociological and logistical sharing issues; infrastructural challenges for multi-site, multi-datatype archiving; and the means by which to explore and mine these data. As neuroimaging advances further into areas such as aging, genetics, and age-related disease, a new vision is needed to manage and process this information while marshalling these resources into novel results. Thus, “big data” can become “big” brain science.
Acknowledgments
This article was supported by a P41 (RR013642) award from the National Institute of General Medical Sciences (NIGMS), through the National Institutes of Health Roadmap for Medical Research (U54 RR021813, The Center for Computational Biology, CCB), and through a National Institute of Mental Health ARRA award (RC1 MH088194). We express our gratitude to the faculty and staff of the Laboratory of Neuro Imaging (LONI) at the University of California, Los Angeles.
Cite this article
Van Horn, J.D., Toga, A.W. Human neuroimaging as a “Big Data” science. Brain Imaging and Behavior 8, 323–331 (2014). https://doi.org/10.1007/s11682-013-9255-y
DOI: https://doi.org/10.1007/s11682-013-9255-y