Efficient Query Reverse Engineering Using Table Fragments

Li, Meiying; Chan, Chee-Yong

doi:10.1007/978-3-030-59419-0_25

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12114)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

Given an output table T that is the result of some unknown query on a database D, Query Reverse Engineering (QRE) computes one or more target queries Q such that the result of Q on D is T. A fundamental challenge in QRE is how to compute target queries efficiently given the large search space. In this paper, we focus on the QRE problem for PJ\(^+\) queries, a class that is more expressive than project-join queries because it supports antijoins as well as inner joins. To enhance efficiency, we propose a novel query-centric approach consisting of table partitioning, precomputation, and indexing techniques. Our experimental study demonstrates that our approach significantly outperforms the state-of-the-art solution by an average improvement factor of 120.
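
The problem statement can be made concrete with a minimal sketch of the QRE correctness condition only, not of the paper's algorithm: a candidate query Q is a valid target when running Q on D returns exactly T. Everything in the example is an assumption made for illustration: the toy schema (`customer`, `orders`), the helper `reproduces`, the two candidate queries, and the choice to compare results as multisets. The antijoin is written with `NOT EXISTS`.

```python
import sqlite3
from collections import Counter


def build_toy_database():
    """Create a tiny in-memory database D (hypothetical schema, illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (cid INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (oid INTEGER PRIMARY KEY,
                             cid INTEGER REFERENCES customer(cid));
        INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO orders VALUES (10, 1), (11, 1), (12, 3);
    """)
    return conn


def reproduces(conn, query, target_rows):
    """Return True if `query` on D yields exactly the rows of T.

    Rows are compared as multisets (an assumption of this sketch),
    so row order does not matter but duplicates do.
    """
    result = conn.execute(query).fetchall()
    return Counter(result) == Counter(target_rows)


# Candidate 1: projection over an inner (foreign-key) join.
INNER_JOIN = """
    SELECT c.name FROM customer c JOIN orders o ON o.cid = c.cid
"""

# Candidate 2: projection over an antijoin (customers without any order).
ANTI_JOIN = """
    SELECT c.name FROM customer c
    WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.cid = c.cid)
"""

conn = build_toy_database()
T = [("Bob",)]  # output table whose generating query is unknown

matches = {
    "inner_join": reproduces(conn, INNER_JOIN, T),
    "anti_join": reproduces(conn, ANTI_JOIN, T),
}
print(matches)  # {'inner_join': False, 'anti_join': True}
```

In this toy instance only the antijoin candidate reproduces T. A QRE system is not handed such candidates; it has to search for them over the schema, which is where the large search space mentioned above comes from.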

Notes

1. In contrast, our approach took 3s to reverse engineer this query (Sect. 6).
2. Although our experiments focus on queries with foreign-key joins (similar to all competing approaches [8, 20]), our approach can be easily extended to reverse engineer PJ queries with non-foreign-key join predicates. The main extension is to explicitly annotate the database schema graph with additional join edges.
3. Note that this example is not related to the example in Fig. 2.
4. We did not compare against FastQRE [8] for two reasons. First, FastQRE supports only CPJ queries, which are even more restrictive than PJ queries. Second, the code for FastQRE is not available, and its non-trivial implementation requires modifying a database system engine in order to use its query optimizer's cost model for ranking candidate queries.
5. The code for STAR was based on a version obtained from the authors of [20].

References

1. Abouzied, A., Angluin, D., Papadimitriou, C., Hellerstein, J.M., Silberschatz, A.: Learning and verifying quantified Boolean queries by example. In: PODS (2013)
2. Arenas, M., Diaz, G.I.: The exact complexity of the first-order logic definability problem. ACM TODS 41(2), 13:1–13:14 (2016)
3. Bonifati, A., Ciucanu, R., Staworko, S.: Learning join queries from user examples. ACM TODS 40, 1–38 (2016)
4. Das Sarma, A., Parameswaran, A., Garcia-Molina, H., Widom, J.: Synthesizing view definitions from data. In: ICDT (2010)
5. Gao, Y., Liu, Q., Chen, G., Zheng, B., Zhou, L.: Answering why-not questions on reverse top-k queries. PVLDB 8, 738–749 (2015)
6. He, Z., Lo, E.: Answering why-not questions on top-k queries. In: ICDE (2012)
7. He, Z., Lo, E.: Answering why-not questions on top-k queries. TKDE 26, 1300–1315 (2014)
8. Kalashnikov, D.V., Lakshmanan, L.V., Srivastava, D.: FastQRE: fast query reverse engineering. In: SIGMOD (2018)
9. Li, H., Chan, C.Y., Maier, D.: Query from examples: an iterative, data-driven approach to query construction. PVLDB 8, 2158–2169 (2015)
10. Li, M., Chan, C.Y.: Efficient query reverse engineering using table fragments. Technical report (2019)
11. Liu, Q., Gao, Y., Chen, G., Zheng, B., Zhou, L.: Answering why-not and why questions on reverse top-k queries. VLDB J. 25, 867–892 (2016)
12. Luo, Y., Fletcher, G.H.L., Hidders, J., Wu, Y., Bra, P.D.: External memory k-bisimulation reduction of big graphs. In: ACM CIKM, pp. 919–928 (2013)
13. Panev, K., Michel, S., Milchevski, E., Pal, K.: Exploring databases via reverse engineering ranking queries with paleo. PVLDB 13, 1525–1528 (2016)
14. Psallidas, F., Ding, B., Chakrabarti, K., Chaudhuri, S.: S4: top-k spreadsheet-style search for query discovery. In: SIGMOD (2015)
15. Shen, Y., Chakrabarti, K., Chaudhuri, S., Ding, B., Novik, L.: Discovering queries based on example tuples. In: SIGMOD (2014)
16. Tan, W.C., Zhang, M., Elmeleegy, H., Srivastava, D.: Reverse engineering aggregation queries. PVLDB 10, 1394–1405 (2017)
17. Tran, Q.T., Chan, C.Y.: How to conquer why-not questions. In: SIGMOD (2010)
18. Tran, Q.T., Chan, C.Y., Parthasarathy, S.: Query by output. In: SIGMOD (2009)
19. Weiss, Y.Y., Cohen, S.: Reverse engineering SPJ-queries from examples. In: PODS (2017)
20. Zhang, M., Elmeleegy, H., Procopiuc, C.M., Srivastava, D.: Reverse engineering complex join queries. In: SIGMOD (2013)

Acknowledgements

We would like to thank Meihui Zhang for sharing the code of STAR. This research is supported in part by MOE Grant R-252-000-A53-114.

Author information

Authors and Affiliations

National University of Singapore, Singapore, Singapore
Meiying Li & Chee-Yong Chan

Corresponding author

Correspondence to Chee-Yong Chan.

Editor information

Editors and Affiliations

Dankook University, Yongin, Korea (Republic of)
Yunmook Nah
Peking University, Haidian, China
Bin Cui
Sungkyunkwan University, Suwon, Korea (Republic of)
Sang-Won Lee
Department of Systems Engineering and En, The Chinese University of Hong Kong, Hong Kong, Hong Kong
Jeffrey Xu Yu
Kangwon National University, Chunchon, Korea (Republic of)
Yang-Sae Moon
Korea Advanced Institute of Science and, Daejeon, Korea (Republic of)
Steven Euijong Whang

Cite this paper

Li, M., Chan, CY. (2020). Efficient Query Reverse Engineering Using Table Fragments. In: Nah, Y., Cui, B., Lee, SW., Yu, J.X., Moon, YS., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science, vol 12114. Springer, Cham. https://doi.org/10.1007/978-3-030-59419-0_25

DOI: https://doi.org/10.1007/978-3-030-59419-0_25
Published: 22 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59418-3
Online ISBN: 978-3-030-59419-0
eBook Packages: Computer Science, Computer Science (R0)
