World Wide Web, Volume 17, Issue 1, pp 127–159

XML keyword search with promising result type recommendations

  • Jianxin Li
  • Chengfei Liu
  • Rui Zhou
  • Wei Wang


Keyword search lets inexperienced users query XML databases without knowing complex structured query languages or the underlying XML schemas. Existing work has addressed how to select the data nodes that match the keywords and connect them in a meaningful way, e.g., via smallest lowest common ancestor (SLCA) and exclusive lowest common ancestor (ELCA) semantics. However, returning every connected subtree is time-consuming and unnecessary, because users are generally interested in only part of the relevant results. In this paper, we propose a new keyword search approach that uses statistics of the underlying XML data to decide the promising result types and then quickly retrieves the results that conform to the selected types. To guarantee the quality of the selected promising result types, we measure the correlation between each result type and a keyword query by analyzing the distribution of the relevant keywords and their structures within the XML data to be searched. In addition, relevant result types can be computed efficiently without evaluating the keyword query and without any schema information. To directly return the top-k keyword search results that conform to the suggested promising result types, we design two new algorithms that adapt to the structural sensitivity of the keyword nodes over the keyword search results. Lastly, we implement all proposed approaches and present experimental results that show the effectiveness of our approach.
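As an illustration of the SLCA semantics mentioned above, the following is a minimal brute-force sketch, not the algorithms proposed in the paper. The toy document, the function name `slca_paths`, and the choice to report each result by the tag path of its root are all assumptions made for this example; treating a tag path as a "result type" is one plausible reading and is not taken from the abstract.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical toy document (not from the paper).
DOC = """
<bib>
  <book>
    <title>XML keyword search</title>
    <author>Liu</author>
  </book>
  <article>
    <title>Ranking XML results</title>
    <author>Chen</author>
  </article>
  <article>
    <title>Graph search</title>
    <author>Liu</author>
  </article>
</bib>
"""


def slca_paths(root, keywords):
    """Return the tag paths of all SLCA nodes, found by brute force.

    A node is an SLCA if its subtree contains every keyword and no
    child subtree contains every keyword on its own.
    """
    wanted = {k.lower() for k in keywords}
    results = []

    def visit(node, parent_path):
        path = parent_path + "/" + node.tag
        # Keywords matched by this node's own text (whole-word match).
        words = (node.text or "").lower().split()
        found = {k for k in wanted if k in words}
        child_has_all = False
        for child in node:
            child_found, child_full = visit(child, path)
            found |= child_found
            child_has_all = child_has_all or child_full
        has_all = found == wanted
        if has_all and not child_has_all:
            results.append(path)
        return found, has_all

    visit(root, "")
    return results


ROOT = ET.fromstring(DOC.strip())

if __name__ == "__main__":
    paths = slca_paths(ROOT, ["search", "liu"])
    # Tally results by the tag path of their root (assumed "result type").
    print(Counter(paths))
```

For the query `search liu`, both the `book` and the last `article` qualify, while `bib` does not because a child subtree already covers every keyword. Grouping the results by tag path shows why a user might want to pick one type before any subtree is retrieved.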


XML data management · XML keyword query · Result type suggestion





Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Faculty of Information & Communication Technologies, Swinburne University of Technology, Melbourne, Australia
  2. School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
