Exemplar queries: a new way of searching
Abstract
Modern search engines employ advanced techniques that go beyond the structures that strictly satisfy the query conditions in an effort to better capture the user intentions. In this work, we introduce a novel query paradigm that considers a user query as an example of the data in which the user is interested. We call these queries exemplar queries. We provide a formal specification of their semantics and show that they are fundamentally different from notions like queries by example, approximate queries and related queries. We provide an implementation of these semantics for knowledge graphs and present an exact solution with a number of optimizations that improve performance without compromising the result quality. We study two different congruence relations, isomorphism and strong simulation, for identifying the answers to an exemplar query. We also provide an approximate solution that prunes the search space and achieves considerably better time performance with minimal or no impact on effectiveness. The effectiveness and efficiency of these solutions with synthetic and real datasets are experimentally evaluated, and the importance of exemplar queries in practice is illustrated.
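As a rough illustration of the isomorphism-based semantics, the following minimal sketch treats a knowledge graph as a set of (subject, predicate, object) triples and returns every edge set that matches the user's sample under an injective node mapping that preserves edge labels. The function name `exemplar_answers`, the triple encoding, and the toy data are assumptions made for this illustration and are not taken from the paper. The sketch also assumes the sample has already been located in the graph, and it uses plain backtracking rather than the optimized exact or approximate solutions described above.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical encoding: one labeled edge is a (subject, predicate, object) triple.
Triple = Tuple[str, str, str]


def exemplar_answers(graph: Set[Triple], sample: List[Triple]) -> Set[FrozenSet[Triple]]:
    """Return every edge set in `graph` that is isomorphic to `sample`
    under an injective node mapping that preserves edge labels."""
    answers: Set[FrozenSet[Triple]] = set()

    def extend(i: int, mapping: Dict[str, str]) -> None:
        # All sample edges are matched: record the image of the sample.
        if i == len(sample):
            answers.add(frozenset((mapping[s], p, mapping[o]) for s, p, o in sample))
            return
        s, p, o = sample[i]
        for gs, gp, go in graph:
            if gp != p:  # edge labels must be identical
                continue
            candidate = dict(mapping)
            consistent = True
            for sample_node, graph_node in ((s, gs), (o, go)):
                if sample_node in candidate:
                    # Already mapped: the earlier choice must agree.
                    if candidate[sample_node] != graph_node:
                        consistent = False
                        break
                elif graph_node in candidate.values():
                    # Injectivity: a graph node serves at most one sample node.
                    consistent = False
                    break
                else:
                    candidate[sample_node] = graph_node
            if consistent:
                extend(i + 1, candidate)

    extend(0, {})
    return answers


# Toy data, invented for this sketch.
graph = {
    ("Google", "acquired", "YouTube"),
    ("Google", "founded_in", "California"),
    ("Yahoo", "acquired", "Tumblr"),
    ("Yahoo", "founded_in", "California"),
    ("Microsoft", "founded_in", "New Mexico"),
}
sample = [
    ("Google", "acquired", "YouTube"),
    ("Google", "founded_in", "California"),
]

for answer in exemplar_answers(graph, sample):
    print(sorted(answer))
```

The returned set contains the sample itself along with the structurally similar Yahoo subgraph; a caller that wants only new results would remove the sample from the answers. Strong simulation, the second congruence relation, relaxes the one-to-one node mapping and is not covered by this sketch.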
Keywords
Exemplar query, Query answering, Knowledge graph, Knowledge base
Acknowledgments
This work was partially supported by the Trento RISE Big Data Project [4] and the Keystone COST action IC1302. We would like to thank the authors of [10], NeMa [29] and strong simulation [32] for kindly providing us their code. We thank Paola Quaglia for the valuable discussion and suggestions about simulation.
References
- 1. Agrawal, R., Gollapudi, S., Halverson, A., Ieong, S.: Diversifying search results. In: WSDM (2009)
- 2. Anagnostopoulos, A., Becchetti, L., Castillo, C., Gionis, A.: An optimization framework for query recommendation. In: WSDM (2010)
- 3. Baeza-Yates, R., Boldi, P., Castillo, C.: Generalizing pagerank: damping functions for link-based ranking algorithms. In: SIGIR (2006)
- 4. Bedini, I., Elser, B., Velegrakis, Y.: The trento big data platform for public administration and large companies: use cases and opportunities. In: PVLDB, vol. 6(11) (2013)
- 5. Beeri, C., Milo, T.: Schemas for integration and translation of structured and semi-structured data. In: ICDT. Springer, Berlin (1999)
- 6. Bergamaschi, S., Domnori, E., Guerra, F., Trillo Lado, R., Velegrakis, Y.: Keyword search over relational databases: a metadata approach. In: SIGMOD (2011)
- 7. Bergamaschi, S., Guerra, F., Rota, S., Velegrakis, Y.: A hidden markov model approach to keyword-based search over relational databases. In: ER (2011)
- 8. Bhatia, S., Majumdar, D., Mitra, P.: Query suggestions in the absence of query logs. In: SIGIR (2011)
- 9. Boldi, P., Bonchi, F., Castillo, C., Vigna, S.: Query reformulation mining: models, patterns, and applications. Inf. Retr. 14(3), 257 (2011)
- 10. Bordino, I., De Francisci Morales, G., Weber, I., Bonchi, F.: From machu_picchu to rafting the urubamba river: anticipating information needs via the entity-query graph. In: WSDM (2013)
- 11. Chakrabarti, S.: Dynamic personalized pagerank in entity-relation graphs. In: WWW (2007)
- 12. Cook, S. A.: The complexity of theorem-proving procedures. In: Symposium on Theory of Computing (1971)
- 13. Dimitriadou, K., Papaemmanouil, O., Diao, Y.: Explore-by-example: an automatic query steering framework for interactive data exploration. In: SIGMOD (2014)
- 14. Dong, X., Halevy, A.Y., Madhavan, J.: Reference reconciliation in complex information spaces. In: SIGMOD (2005)
- 15. Dou, Z., Hu, S., Luo, Y., Song, R., Wen, J.: Finding dimensions for queries. In: CIKM, pp. 1311–1320 (2011)
- 16. Fan, W., Li, J., Ma, S., Wang, H., Wu, Y.: Graph homomorphism revisited for graph matching. PVLDB 3(1–2), 1161 (2010)
- 17. Gallego, M.A., Fernández, J.D., Martínez-Prieto, M.A., de la Fuente, P.: An empirical study of real-world SPARQL queries. In: USEWOD Workshop, WWW (2011)
- 18. Gao, X., Xiao, B., Tao, D., Li, X.: A survey of graph edit distance. Pattern Anal. Appl. 13(1), 113 (2010)
- 19. Gauch, S., Smith, J.B.: Search improvement via automatic query reformulation. TOIS 9(3), 249–280 (1991)
- 20. Google. Freebase data dumps. https://developers.google.com/freebase/data (2014)
- 21. Haveliwala, T. H.: Topic-sensitive pagerank. In: WWW (2002)
- 22. Henzinger, M. R., Henzinger, T. A., Kopke, P. W.: Computing simulations on finite and infinite graphs. In: FOCS (1995)
- 23. Hogan, A., Mellotte, M., Powell, G., Stampouli, D.: Towards fuzzy query-relaxation for rdf. In: The Semantic Web: Research and Applications, pp. 687–702. Springer, Berlin (2012)
- 24. Jansen, B., Booth, D., Spink, A.: Determining the informational, navigational, and transactional intent of web queries. Inf. Process. Manag. 44, 1251 (2008)
- 25. Jeh, G., Widom, J.: Scaling personalized web search. In: WWW (2003)
- 26. Kargar, M., An, A.: Keyword search in graphs: Finding r-cliques. Proc. VLDB Endow. 4(10), 681 (2011)
- 27. Kasneci, G., Ramanath, M., Sozio, M., Suchanek, F.M., Weikum, G.: Star: Steiner-tree approximation in relationship graphs. In: ICDE (2009)
- 28. Khan, A., Li, N., Yan, X., Guan, Z., Chakraborty, S., Tao, S.: Neighborhood based fast graph search in large networks. In: SIGMOD (2011)
- 29. Khan, A., Wu, Y., Aggarwal, C.C., Yan, X.: Nema: Fast graph search with label similarity. In: PVLDB (2013)
- 30. Lao, N., Cohen, W.W.: Fast query execution for retrieval models based on path-constrained random walks. In: KDD (2010)
- 31. Lissandrini, M., Mottin, D., Palpanas, T., Papadimitriou, D., Velegrakis, Y.: Unleashing the power of information graphs. SIGMOD Rec. 43(4), 21 (2015)
- 32. Ma, S., Cao, Y., Fan, W., Huai, J., Wo, T.: Strong simulation: capturing topology in graph pattern matching. TODS 39(1), 4 (2014)
- 33. Mishra, C., Koudas, N.: Interactive query refinement. In: EDBT (2009)
- 34. Mottin, D., Lissandrini, M., Velegrakis, Y., Palpanas, T.: Exemplar queries: give me an example of what you need. Proc. VLDB Endow. 7(5), 365 (2014)
- 35. Mottin, D., Lissandrini, M., Velegrakis, Y., Palpanas, T.: Searching with XQ: The Exemplar Query Search Engine. In: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, pp. 901–904. ACM, NY, USA
- 36. Mottin, D., Marascu, A., Roy, S.B., Das, G., Palpanas, T., Velegrakis, Y.: A probabilistic optimization framework for the empty-answer problem. Proc. VLDB Endow. 6(14), 1762–1773 (2013)
- 37. Mottin, D., Palpanas, T., Velegrakis, Y.: Entity Ranking Using Click-Log Information. Intel. Data Anal. J. 17(5), 837 (2013)
- 38. Ngo, V. M., Cao, T. H.: Ontology-based query expansion with latently related named entities for semantic text search. In: IJIIDS (2010)
- 39. Page, L., Brin, S., Motwani, R., Winograd, T.: The pagerank citation ranking: bringing order to the web. TR 1999-66, Stanford InfoLab (November)
- 40. Park, D.: Concurrency and Automata on Infinite Sequences. Springer, Berlin (1981)
- 41. Pound, J., Hudek, A. K., Ilyas, I. F., Weddell, G.: Interpreting keyword queries over web knowledge bases. In: CIKM (2012)
- 42. Qiu, Y., Frei, H.-P.: Concept based query expansion. In: SIGIR (1993)
- 43. Shannon, C.E.: A mathematical theory of communication. SIGMOBILE Mob. Comput. Commun. Rev. 5(1), 3–55 (2001)
- 44. Shen, Y., Chakrabarti, K., Jones, M.: Discovering queries based on example tuples. In: SIGMOD (2014)
- 45. Ullmann, J.R.: An Algorithm for Subgraph Isomorphism. J. ACM 23(1), 31–42 (1976)
- 46. Vallet, D., Zaragoza, H.: Inferring the most important types of a query: a semantic approach. In: SIGIR, pp. 857–858 (2008)
- 47. Wang, X., Ding, X., Tung, A. K. H., Ying, S., Jin, H.: An efficient graph indexing method. In: ICDE, pp. 210–221 (2012)
- 48. Wang, X., Zhai, C.: Mining term association patterns from search logs for effective query reformulation. In: CIKM, pp. 479–488 (2008)
- 49. Xing, W., Ghorbani, A.: Weighted pagerank algorithm. In: CNSR, pp. 305–314 (2004)
- 50. Yan, X., Yu, P. S., Han, J.: Graph indexing: a frequent structure-based approach. In: SIGMOD (2004)
- 51. Yang, S., Wu, Y., Sun, H., Yan, X.: Schemaless and structureless graph querying. Proc. VLDB Endow. 7(7), 565 (2014)
- 52. Zhao, P., Han, J.: On graph query optimization in large networks. VLDB J. 3(1–2), 340–351 (2010)
- 53. Zloof, M. M.: Query by example. In: AFIPS NCC, pp. 431–438 (1975)