Addressing Scientific Rigor in Data Analytics Using Semantic Workflows

  • Conference paper
Provenance and Annotation of Data and Processes (IPAW 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9672)

Abstract

New NIH grants require applicants to establish scientific rigor, i.e., to provide evidence that the scientific method has been strictly applied to ensure robust and unbiased experimental design, methodology, analysis, interpretation, and reporting of results. Researchers must report experimental details transparently so that others can reproduce and extend their findings. Provenance can help meet these objectives: analytical workflows can be annotated with enough information for peers to understand the methods and reproduce the intended results. We aim to enhance the ontology space by linking existing ontologies, performing a terminology gap analysis and proposing ontology terms to address the gaps, and potentially creating a new ontology that integrates higher-level data analysis planning concepts. We are developing a collection of techniques and tools that enable workflow recipes or plans to be shared more clearly and consistently, improve understanding of all aspects of an analysis, and enable greater reuse and reproduction. We aim to show that semantic workflows can improve scientific rigor in data analysis and to demonstrate their impact in specific research domains.
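
As a rough illustration of what annotating an analytical workflow with provenance could look like, the sketch below builds a handful of W3C PROV-O statements for one analysis run and prints them as N-Triples. It is not the authors' tooling: the `http://example.org/analysis#` namespace, the helper names `describe_run` and `to_ntriples`, and the resource names (`regression_plan`, `run1`, `analyst`, `raw_data`, `model`) are all hypothetical. Only the PROV-O terms themselves (`prov:Plan`, `prov:Activity`, `prov:used`, `prov:wasGeneratedBy`, `prov:qualifiedAssociation`, `prov:hadPlan`) come from the W3C vocabulary.

```python
# Minimal sketch, standard library only. All example.org names are hypothetical.
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/analysis#"  # assumed namespace for illustration


def describe_run(plan, activity, agent, inputs, outputs):
    """Return PROV-O triples linking one analysis run to the plan it followed."""
    assoc = activity + "_association"
    triples = [
        (EX + plan, RDF_TYPE, PROV + "Plan"),
        (EX + activity, RDF_TYPE, PROV + "Activity"),
        (EX + agent, RDF_TYPE, PROV + "Agent"),
        # A qualified association records which plan the agent followed.
        (EX + activity, PROV + "qualifiedAssociation", EX + assoc),
        (EX + assoc, RDF_TYPE, PROV + "Association"),
        (EX + assoc, PROV + "agent", EX + agent),
        (EX + assoc, PROV + "hadPlan", EX + plan),
    ]
    for name in inputs:
        triples.append((EX + name, RDF_TYPE, PROV + "Entity"))
        triples.append((EX + activity, PROV + "used", EX + name))
    for name in outputs:
        triples.append((EX + name, RDF_TYPE, PROV + "Entity"))
        triples.append((EX + name, PROV + "wasGeneratedBy", EX + activity))
    return triples


def to_ntriples(triples):
    """Serialize (subject, predicate, object) IRI triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = describe_run(
    plan="regression_plan",
    activity="run1",
    agent="analyst",
    inputs=["raw_data"],
    outputs=["model"],
)
print(to_ntriples(triples))
```

The plan here corresponds to what the notes call prospective provenance, while the `used` and `wasGeneratedBy` statements describe one concrete execution of it. A real deployment would more likely use an RDF library and domain ontologies than hand-built tuples.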


Notes

  1. W3C PROV-O refers to these as “plans.” http://www.w3.org/TR/prov-o/#Plan

  2. For example: Kepler [4], Taverna [5], WINGS [6], etc.

  3. https://www.w3.org/RDF/

  4. https://www.w3.org/TR/prov-o/

  5. “Prospective provenance” refers to a workflow’s “plan” or “recipe.” See [12].

  6. We know of no ontology that enables scripted workflow processes accomplishing semantically similar tasks to be annotated in the same way using the same vocabulary.

  7. http://stato-ontology.org/

References

  1. NIH Grants: Funding: Rigor and Reproducibility. https://grants.nih.gov/reproducibility/index.htm

  2. Repetitive flaws: Strict guidelines to improve the reproducibility of experiments are a welcome move, Nature Editorial. http://www.nature.com/news/repetitive-flaws-1.19192

  3. Challenges In Irreproducible Research: Nature News & Comment Special. http://www.nature.com/news/reproducibility-1.17552

  4. Bowers, S., Ludäscher, B.: Actor-oriented design of scientific workflows. In: Conceptual Modeling ER 2005: Proceedings of the 24th International Conference on Conceptual Modeling, Klagenfurt, Austria, October 2005

  5. Oinn, T., et al.: Taverna: a tool for the composition and enactment of bioinformatics workflows. Bioinformatics 20(17), 3045–3054 (2004)

  6. Gil, Y., et al.: Wings: intelligent workflow-based design of computational experiments. IEEE Intell. Syst. 26(1), 62–72 (2011)

  7. Davidson, S.B., Freire, J.: Provenance and scientific workflows: challenges and opportunities. In: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, pp. 1345–1350. ACM Press, New York (2008)

  8. Murta, L., Braganholo, V., Chirigati, F., Koop, D., Freire, J.: noWorkflow: capturing and analyzing provenance of scripts. In: Ludäscher, B., Plale, B. (eds.) IPAW 2014. LNCS, vol. 8628, pp. 71–83. Springer, Heidelberg (2015)

  9. McPhillips, T., et al.: YesWorkflow: a user-oriented, language-independent tool for recovering workflow information from scripts. Int. J. Digit. Curation 10, 298–313 (2015)

  10. McPhillips, T., et al.: Retrospective provenance without a runtime provenance recorder. In: USENIX Workshop on Theory and Practice of Provenance (2015)

  11. DataONE Scientific Workflows, Provenance Working Group: ProvONE: a PROV extension data model for scientific workflow provenance. W3C unofficial draft, 27 March 2014. http://vcvcomputing.com/provone/provone.html

  12. Lim, C., Lu, S., Chebotko, A., Fotouhi, F.: Prospective and retrospective provenance collection in scientific workflow environments. In: 2010 IEEE International Conference on Services Computing (2010)

Acknowledgements

Thanks to T. McPhillips of UIUC and B. Ludäscher of UC-Davis for help with YesWorkflow, D. Garijo and V. Ratnakar of USC ISI for help with WINGS, and NSF Grant No. 1331023.

Author information

Corresponding author

Correspondence to John S. Erickson.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Erickson, J.S., Sheehan, J., Bennett, K.P., McGuinness, D.L. (2016). Addressing Scientific Rigor in Data Analytics Using Semantic Workflows. In: Mattoso, M., Glavic, B. (eds) Provenance and Annotation of Data and Processes. IPAW 2016. Lecture Notes in Computer Science, vol 9672. Springer, Cham. https://doi.org/10.1007/978-3-319-40593-3_18

  • DOI: https://doi.org/10.1007/978-3-319-40593-3_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-40592-6

  • Online ISBN: 978-3-319-40593-3

  • eBook Packages: Computer Science, Computer Science (R0)
