Abstract
While many scientific workflow systems track and record data provenance, few tools provide convenient and effective ways to access and explore this information. Two important ways to access and explore provenance information are browsing (i.e., visualizing and navigating data and process dependencies) and querying (e.g., to select certain portions of provenance graphs or to determine whether certain paths exist between items within a graph). We extend our prior work on representing and querying data provenance by showing how browsing and querying can be effectively and efficiently combined in an interactive provenance browser. The browser allows different views of provenance to be explored and queried, where queries are expressed in a declarative graph-based provenance query language. Query results are expressed as provenance subgraphs, which can be further visualized and navigated through the browser. The browser supports a generic model of provenance that can be used with various workflow computation models and has a direct translation to the Open Provenance Model. We present the provenance model and the query language, and describe the overall browser architecture and implementation.
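To make the two kinds of query mentioned in the abstract concrete, the following is a minimal Python sketch of a path-existence check and a lineage-subgraph selection over a small dependency graph. It is an illustration only and is not the paper's query language or browser implementation: the graph `DEPENDS_ON`, the node names (`d1`..`d4` for data items, `p1`..`p2` for process invocations), and the functions `lineage`, `path_exists`, and `lineage_subgraph` are all hypothetical.

```python
from collections import deque

# Hypothetical provenance graph (an assumption for illustration):
# each key depends on, i.e. was derived from, the nodes in its list.
DEPENDS_ON = {
    "d4": ["p2"],
    "p2": ["d2", "d3"],
    "d3": ["p1"],
    "d2": ["p1"],
    "p1": ["d1"],
    "d1": [],
}


def lineage(graph, start):
    """Return every node that `start` transitively depends on."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in graph.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def path_exists(graph, source, target):
    """Return True if a dependency path leads from `source` to `target`."""
    return target in lineage(graph, source)


def lineage_subgraph(graph, start):
    """Return the subgraph induced by `start` and everything it depends on."""
    keep = lineage(graph, start) | {start}
    return {n: [p for p in graph.get(n, []) if p in keep] for n in keep}
```

Under these assumptions, `path_exists(DEPENDS_ON, "d4", "d1")` asks whether `d4` was derived from `d1`, and `lineage_subgraph(DEPENDS_ON, "d3")` returns a subgraph that could itself be queried or navigated further, mirroring the idea that query results are provenance subgraphs.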
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Anand, M.K., Bowers, S., Altintas, I., Ludäscher, B. (2010). Approaches for Exploring and Querying Scientific Workflow Provenance Graphs. In: McGuinness, D.L., Michaelis, J.R., Moreau, L. (eds) Provenance and Annotation of Data and Processes. IPAW 2010. Lecture Notes in Computer Science, vol 6378. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17819-1_3
DOI: https://doi.org/10.1007/978-3-642-17819-1_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17818-4
Online ISBN: 978-3-642-17819-1
eBook Packages: Computer Science, Computer Science (R0)