Abstract
The launch of the US BRAIN and European Human Brain Projects coincides with growing international efforts toward transparency and increased access to publicly funded research in the neurosciences. The need for data-sharing standards and neuroinformatics infrastructure is more pressing than ever. However, 'big science' efforts are not the only drivers of data-sharing needs, as neuroscientists across the full spectrum of research grapple with the overwhelming volume of data being generated daily and a scientific environment that is increasingly focused on collaboration. In this commentary, we consider the issue of sharing of the richly diverse and heterogeneous small data sets produced by individual neuroscientists, so-called long-tail data. We consider the utility of these data, the diversity of repositories and options available for sharing such data, and emerging best practices. We provide use cases in which aggregating and mining diverse long-tail data convert numerous small data sources into big data for improved knowledge about neuroscience-related disorders.
Acknowledgements
We thank the NIF staff, especially B. Ozyurt, for his text-mining expertise and tools that contributed substantially to Supplementary Table 1. The Neuroscience Information Framework is supported by contract HHSN271200800035C from the NIH Neuroscience Blueprint via the National Institute on Drug Abuse. VISION-SCI is supported by NIH grants NS067092 (A.R.F.) and NS079030 (J.L.N.), the Craig H. Neilsen Foundation (A.R.F.) and the Wings for Life Foundation (A.R.F.). This material is based on work supported while M.H.C. was serving at the National Science Foundation. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of the National Science Foundation.
Ethics declarations
Competing interests
M.E. Martone is the principal investigator of the Neuroscience Information Framework. A.E. Bandrowski is the NIF Project Leader. A.R. Ferguson, J.L. Nielson and M.H. Cragin are not affiliated with NIF.
Supplementary information
Supplementary Table
A sample of neuroscience-centered data repositories available to the community. (PDF 327 kb)
Cite this article
Ferguson, A., Nielson, J., Cragin, M. et al. Big data from small data: data-sharing in the 'long tail' of neuroscience. Nat Neurosci 17, 1442–1447 (2014). https://doi.org/10.1038/nn.3838
DOI: https://doi.org/10.1038/nn.3838