The VLDB Journal, Volume 25, Issue 6, pp 741–765

Exemplar queries: a new way of searching

  • Davide Mottin
  • Matteo Lissandrini
  • Yannis Velegrakis
  • Themis Palpanas
Regular Paper

Abstract

Modern search engines employ advanced techniques that go beyond the structures that strictly satisfy the query conditions in an effort to better capture the user intentions. In this work, we introduce a novel query paradigm that considers a user query as an example of the data in which the user is interested. We call these queries exemplar queries. We provide a formal specification of their semantics and show that they are fundamentally different from notions like queries by example, approximate queries and related queries. We provide an implementation of these semantics for knowledge graphs and present an exact solution with a number of optimizations that improve performance without compromising the result quality. We study two different congruence relations, isomorphism and strong simulation, for identifying the answers to an exemplar query. We also provide an approximate solution that prunes the search space and achieves considerably better time performance with minimal or no impact on effectiveness. The effectiveness and efficiency of these solutions with synthetic and real datasets are experimentally evaluated, and the importance of exemplar queries in practice is illustrated.
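To make the exemplar-query idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a knowledge graph stored as a set of (subject, label, object) triples and a user sample that is itself a small subgraph, and it enumerates every subgraph that is isomorphic to the sample under edge-label preservation. The function name `exemplar_answers`, the triple encoding, and the toy data are all illustrative assumptions. The search is brute force, with none of the optimizations, pruning, or strong-simulation semantics mentioned in the abstract.

```python
from itertools import permutations


def exemplar_answers(graph, sample):
    """Return node mappings from `sample` onto edge-label-preserving
    isomorphic subgraphs of `graph` (brute force; illustrative only)."""
    sample_nodes = sorted({n for s, _, o in sample for n in (s, o)})
    graph_nodes = sorted({n for s, _, o in graph for n in (s, o)})
    answers = []
    # permutations() yields tuples of distinct nodes, so each mapping is injective.
    for candidate in permutations(graph_nodes, len(sample_nodes)):
        mapping = dict(zip(sample_nodes, candidate))
        # Every sample edge must map to a graph edge carrying the same label.
        if all((mapping[s], label, mapping[o]) in graph for s, label, o in sample):
            answers.append(mapping)
    return answers


# Toy knowledge graph (hypothetical encoding chosen for this sketch).
graph = {
    ("Google", "acquired", "YouTube"),
    ("Yahoo", "acquired", "Tumblr"),
    ("Google", "headquartered_in", "Mountain View"),
    ("Yahoo", "headquartered_in", "Sunnyvale"),
    ("YouTube", "headquartered_in", "San Bruno"),
}

# The user's example: a company, one of its acquisitions, and its headquarters.
sample = {
    ("Google", "acquired", "YouTube"),
    ("Google", "headquartered_in", "Mountain View"),
}

answers = exemplar_answers(graph, sample)
for mapping in answers:
    print(mapping)
```

In this sketch the sample matches itself, so the identity mapping appears among the answers alongside the structurally similar Yahoo subgraph. A practical system would need to avoid the exhaustive enumeration used here, which is feasible only for very small graphs.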

Keywords

  • Exemplar query
  • Query answering
  • Knowledge graph
  • Knowledge base

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Davide Mottin (1)
  • Matteo Lissandrini (2)
  • Yannis Velegrakis (2)
  • Themis Palpanas (3)
  1. Hasso Plattner Institute, Potsdam, Germany
  2. University of Trento, Trento, Italy
  3. Paris Descartes University, Paris, France
