Addressing Scientific Rigor in Data Analytics Using Semantic Workflows

  • John S. Erickson
  • John Sheehan
  • Kristin P. Bennett
  • Deborah L. McGuinness
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9672)

Abstract

New NIH grants require applicants to establish scientific rigor, i.e., to provide evidence that the scientific method has been strictly applied to ensure robust and unbiased experimental design, methodology, analysis, interpretation, and reporting of results. Researchers must report experimental details transparently so that others can reproduce and extend their findings. Provenance can help meet these objectives: analytical workflows can be annotated with enough information for peers to understand the methods and reproduce the intended results. We aim to enhance the ontology space by adding links between existing ontologies, performing a terminology gap analysis and defining ontology terms to address the gaps found, and potentially creating a new ontology that integrates higher-level data analysis planning concepts. We are developing a collection of techniques and tools that allow workflow recipes or plans to be shared more clearly and consistently, improve understanding of all aspects of an analysis, and enable greater reuse and reproduction. We aim to show that semantic workflows can improve scientific rigor in data analysis and to demonstrate their impact in specific research domains.

Keywords

Provenance, Ontologies, Scientific rigor, Reproducibility

Notes

Acknowledgements

Thanks to T. McPhillips of UIUC and B. Ludäscher of UC-Davis for help with YesWorkflow, and to D. Garijo and V. Ratnakar of USC ISI for help with WINGS. Thanks also to NSF Grant No. 1331023.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • John S. Erickson (1)
  • John Sheehan (1)
  • Kristin P. Bennett (1)
  • Deborah L. McGuinness (1)

  1. Rensselaer Polytechnic Institute, Troy, USA
