Neuroimaging Data Provenance Using the LONI Pipeline Workflow Environment

  • Allan J. MacKenzie-Graham
  • Arash Payan
  • Ivo D. Dinov
  • John D. Van Horn
  • Arthur W. Toga
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5272)


Provenance, the description of the history of a set of data, has become important in the neurosciences with the proliferation of consortium-based neuroimaging efforts. Knowledge about the origin, preprocessing, analysis, and post hoc processing of neuroimaging volumes is essential for establishing the quality of data and results, the reproducibility of findings, and their scientific interpretation. Neuroimaging provenance also includes the specifics of the software routines, algorithmic parameters, and operating system settings that were employed in the analysis protocol. The LONI Pipeline is a Java-based workflow environment for the construction and execution of data processing streams. We have developed a provenance framework, integrated with the LONI Pipeline workflow environment, for describing the current and retrospective state of data. Collecting provenance information under this framework relieves the user of much of the burden of documentation while still providing a rich description of an image’s characteristics and of the programs that interacted with that data. This combination of ease of use and highly descriptive meta-data will greatly facilitate the collection of provenance information from brain imaging workflows, encourage subsequent data and meta-data sharing, enhance peer-reviewed publication, and support multi-center collaboration.


Keywords: Provenance, Workflow, Neuroimaging, Grid, Pipeline



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Allan J. MacKenzie-Graham (1)
  • Arash Payan (1)
  • Ivo D. Dinov (1)
  • John D. Van Horn (1)
  • Arthur W. Toga (1)
  1. Laboratory of Neuro Imaging (LONI), Department of Neurology, University of California Los Angeles School of Medicine, Los Angeles, USA
