Addressing Scientific Rigor in Data Analytics Using Semantic Workflows
New NIH grants require applicants to establish scientific rigor: they must provide evidence of strict application of the scientific method to ensure robust and unbiased experimental design, methodology, analysis, interpretation, and reporting of results. Researchers must also report experimental details transparently so that others can reproduce and extend their findings. Provenance can help accomplish these objectives: analytical workflows can be annotated with sufficient information for peers to understand the methods and reproduce the intended results. We aim to enhance the ontology space through links between existing ontologies, a terminology gap analysis with new ontology terms to address the gaps found, and potentially a new ontology that integrates higher-level data analysis planning concepts. We are developing a collection of techniques and tools that allow workflow recipes or plans to be shared more clearly and consistently, improve understanding of all aspects of an analysis, and enable greater reuse and reproduction. We aim to show that semantic workflows can improve scientific rigor in data analysis and to demonstrate their impact in specific research domains.
Keywords: Provenance, Ontologies, Scientific rigor, Reproducibility
Thanks to T. McPhillips of UIUC and B. Ludäscher of UC-Davis for help with YesWorkflow, and to D. Garijo and V. Ratnakar of USC ISI for help with WINGS. This work was supported by NSF Grant No. 1331023.