Abstract
Given an output table T that is the result of some unknown query on a database D, Query Reverse Engineering (QRE) computes one or more target queries Q such that the result of Q on D is T. A fundamental challenge in QRE is computing target queries efficiently despite the large search space. In this paper, we focus on the QRE problem for PJ\(^+\) queries, a class more expressive than project-join (PJ) queries because it supports antijoins as well as inner joins. To enhance efficiency, we propose a novel query-centric approach consisting of table partitioning, precomputation, and indexing techniques. Our experimental study demonstrates that our approach outperforms the state-of-the-art solution by an average factor of 120.
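To make the problem statement concrete, the following is a minimal sketch of the QRE acceptance condition (the result of Q on D equals T) for two hand-written candidate queries, one using an inner join and one using an antijoin. It is not the approach proposed in the paper: the schema, the data, the helper names `build_db` and `produces`, and the choice of bag (multiset) comparison are all assumptions made for illustration.

```python
import sqlite3
from collections import Counter


def build_db():
    """Create a tiny hypothetical database D with one foreign-key edge."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY,
                             cust_id INTEGER REFERENCES customer(id));
        INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO orders VALUES (10, 1), (11, 1), (12, 3);
    """)
    return conn


def produces(conn, query, target):
    """Return True if the query's result on D equals T (assumed bag semantics)."""
    result = Counter(conn.execute(query).fetchall())
    return result == Counter(target)


# Candidate 1: a plain project-join (PJ) query over the foreign-key edge.
INNER_JOIN = (
    "SELECT c.name FROM customer c "
    "JOIN orders o ON o.cust_id = c.id"
)

# Candidate 2: a query with an antijoin (customers that have no orders).
ANTIJOIN = (
    "SELECT c.name FROM customer c "
    "WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.cust_id = c.id)"
)

conn = build_db()
T = [("Bob",)]  # the given output table

candidates = {"inner join": INNER_JOIN, "antijoin": ANTIJOIN}
matching = [name for name, q in candidates.items() if produces(conn, q, T)]
print(matching)
```

In this toy instance only the antijoin candidate reproduces T, which illustrates why a class restricted to inner joins cannot express every target query. A real QRE system must search a far larger candidate space than the two queries enumerated here.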
Notes
1. In contrast, our approach took 3s to reverse engineer this query (Sect. 6).
2. Although our experiments focus on queries with foreign-key joins (similar to all competing approaches [8, 20]), our approach can be easily extended to reverse engineer PJ queries with non-foreign-key join predicates. The main extension is to explicitly annotate the database schema graph with additional join edges.
3. Note that this example is not related to the example in Fig. 2.
4. We did not compare against FastQRE [8] for two reasons. First, FastQRE supports only CPJ queries, which are even more restrictive than PJ queries. Second, the code for FastQRE is not available, and its non-trivial implementation requires modifying a database system engine to use its query optimizer's cost model for ranking candidate queries.
5. The code for STAR was based on a version obtained from the authors of [20].
References
Abouzied, A., Angluin, D., Papadimitriou, C., Hellerstein, J.M., Silberschatz, A.: Learning and verifying quantified Boolean queries by example. In: PODS (2013)
Arenas, M., Diaz, G.I.: The exact complexity of the first-order logic definability problem. ACM TODS 41(2), 13:1–13:14 (2016)
Bonifati, A., Ciucanu, R., Staworko, S.: Learning join queries from user examples. ACM TODS 40, 1–38 (2016)
Das Sarma, A., Parameswaran, A., Garcia-Molina, H., Widom, J.: Synthesizing view definitions from data. In: ICDT (2010)
Gao, Y., Liu, Q., Chen, G., Zheng, B., Zhou, L.: Answering why-not questions on reverse top-k queries. PVLDB 8, 738–749 (2015)
He, Z., Lo, E.: Answering why-not questions on top-k queries. In: ICDE (2012)
He, Z., Lo, E.: Answering why-not questions on top-k queries. TKDE 26, 1300–1315 (2014)
Kalashnikov, D.V., Lakshmanan, L.V., Srivastava, D.: FastQRE: fast query reverse engineering. In: SIGMOD (2018)
Li, H., Chan, C.Y., Maier, D.: Query from examples: an iterative, data-driven approach to query construction. PVLDB 8, 2158–2169 (2015)
Li, M., Chan, C.Y.: Efficient query reverse engineering using table fragments. Technical report (2019)
Liu, Q., Gao, Y., Chen, G., Zheng, B., Zhou, L.: Answering why-not and why questions on reverse top-k queries. VLDB J. 25, 867–892 (2016)
Luo, Y., Fletcher, G.H.L., Hidders, J., Wu, Y., Bra, P.D.: External memory k-bisimulation reduction of big graphs. In: ACM CIKM, pp. 919–928 (2013)
Panev, K., Michel, S., Milchevski, E., Pal, K.: Exploring databases via reverse engineering ranking queries with paleo. PVLDB 13, 1525–1528 (2016)
Psallidas, F., Ding, B., Chakrabarti, K., Chaudhuri, S.: S4: top-k spreadsheet-style search for query discovery. In: SIGMOD (2015)
Shen, Y., Chakrabarti, K., Chaudhuri, S., Ding, B., Novik, L.: Discovering queries based on example tuples. In: SIGMOD (2014)
Tan, W.C., Zhang, M., Elmeleegy, H., Srivastava, D.: Reverse engineering aggregation queries. PVLDB 10, 1394–1405 (2017)
Tran, Q.T., Chan, C.Y.: How to conquer why-not questions. In: SIGMOD (2010)
Tran, Q.T., Chan, C.Y., Parthasarathy, S.: Query by output. In: SIGMOD (2009)
Weiss, Y.Y., Cohen, S.: Reverse engineering SPJ-queries from examples. In: PODS (2017)
Zhang, M., Elmeleegy, H., Procopiuc, C.M., Srivastava, D.: Reverse engineering complex join queries. In: SIGMOD (2013)
Acknowledgements
We would like to thank Meihui Zhang for sharing the code of STAR. This research is supported in part by MOE Grant R-252-000-A53-114.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, M., Chan, CY. (2020). Efficient Query Reverse Engineering Using Table Fragments. In: Nah, Y., Cui, B., Lee, SW., Yu, J.X., Moon, YS., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science, vol 12114. Springer, Cham. https://doi.org/10.1007/978-3-030-59419-0_25
DOI: https://doi.org/10.1007/978-3-030-59419-0_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59418-3
Online ISBN: 978-3-030-59419-0
eBook Packages: Computer Science, Computer Science (R0)