DataONE: A Data Federation with Provenance Support

  • Yang Cao
  • Christopher Jones
  • Víctor Cuevas-Vicenttín
  • Matthew B. Jones
  • Bertram Ludäscher
  • Timothy McPhillips
  • Paolo Missier
  • Christopher Schwalm
  • Peter Slaughter
  • Dave Vieglais
  • Lauren Walker
  • Yaxing Wei
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9672)

Abstract

DataONE is a federated data network focusing on earth and environmental science data. We present the provenance and search features of DataONE by means of an example involving three earth scientists who interact through a DataONE Member Node. DataONE provenance systems enable reproducible research and facilitate proper attribution of scientific results transitively across generations of derived data products.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Yang Cao (1)
  • Christopher Jones (2)
  • Víctor Cuevas-Vicenttín (3)
  • Matthew B. Jones (2)
  • Bertram Ludäscher (1)
  • Timothy McPhillips (1)
  • Paolo Missier (4)
  • Christopher Schwalm (5)
  • Peter Slaughter (2)
  • Dave Vieglais (6)
  • Lauren Walker (2)
  • Yaxing Wei (7)

  1. University of Illinois, Urbana-Champaign, Illinois, USA
  2. National Center for Ecological Analysis and Synthesis, UCSB, Santa Barbara, USA
  3. Universidad Popular Autónoma del Estado de Puebla, Puebla, Mexico
  4. School of Computing Science, Newcastle University, Newcastle upon Tyne, UK
  5. Woods Hole Research Center, Falmouth, USA
  6. University of Kansas, Lawrence, USA
  7. Environmental Sciences Division, ORNL, Oak Ridge, USA
