Abstract
Collecting, storing, tracking, and archiving scientific data is the main task of research data management and forms the basis for scientific evaluations. In addition to the evaluation (a complex query in the case of structured databases) and the result itself, the relevant part of the original database used also has to be archived. To ensure reproducible and replicable research, the evaluation queries can then be processed again at a later point in time to reproduce the result. Calculating the origin of an evaluation result is the main problem in provenance management, particularly for why- and how-provenance. We are developing a tool called ProSA, which combines data provenance and schema/data evolution and uses the CHASE for the different database transformations needed. Besides describing the main ideas of ProSA, this paper focuses on the concrete use of our CHASE tool ChaTEAU for invertible query evaluation.
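To make the idea of archiving only the contributing part of a database more concrete, the following is a minimal Python sketch of why-provenance for a simple selection and projection query. It is an illustration only and not the ProSA or ChaTEAU implementation: it does not use the CHASE, and the relation `measurements`, the tuple identifiers, and the function `evaluate` are all hypothetical names chosen for this example.

```python
from collections import defaultdict

# Hypothetical source relation: tuple id -> (station, depth_m, temperature_c).
measurements = {
    "t1": ("A", 10, 4.2),
    "t2": ("A", 20, 3.9),
    "t3": ("B", 10, 5.1),
    "t4": ("C", 10, 2.0),
}


def evaluate(relation, min_temp):
    """Select rows with temperature >= min_temp and project onto station.

    Returns a dict mapping each result value to its witnesses, where a
    witness is a frozenset of source tuple ids that suffices to derive it.
    """
    result = defaultdict(set)
    for tid, (station, _depth, temp) in relation.items():
        if temp >= min_temp:
            result[station].add(frozenset([tid]))
    return dict(result)


# Evaluate the query and collect every source tuple named in some witness.
result = evaluate(measurements, 3.0)
needed_ids = {tid for witnesses in result.values() for w in witnesses for tid in w}

# Keep only the contributing tuples and re-run the query on that subset.
sub_database = {tid: measurements[tid] for tid in needed_ids}
assert set(evaluate(sub_database, 3.0)) == set(result)
```

In this sketch, tuple `t4` never appears in a witness, so it is left out of `sub_database`, yet re-running the query on the reduced relation yields the same stations. Real evaluations with joins or aggregates need richer provenance than single-tuple witnesses.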
Notes
1. By the use of provenance information.
Acknowledgements
We thank our students Fabian Renn and Frank Röger for their comparison of different CHASE tools such as Llunatic and PDQ, as well as Martin Jurklies for the basic implementation of our CHASE tool ChaTEAU.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Auge, T., Heuer, A. (2019). ProSA—Using the CHASE for Provenance Management. In: Welzer, T., Eder, J., Podgorelec, V., Kamišalić Latifić, A. (eds) Advances in Databases and Information Systems. ADBIS 2019. Lecture Notes in Computer Science, vol 11695. Springer, Cham. https://doi.org/10.1007/978-3-030-28730-6_22
DOI: https://doi.org/10.1007/978-3-030-28730-6_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28729-0
Online ISBN: 978-3-030-28730-6
eBook Packages: Computer Science, Computer Science (R0)