Effective Keyword Search in XML Documents Based on MIU

  • Jianjun Xu
  • Jiaheng Lu
  • Wei Wang
  • Baile Shi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3882)


Keyword search is an effective approach for most users to search for information because they do not need to learn complex query languages or the underlying structures of the data. This paper focuses on effective keyword search in XML documents, which are modeled as labeled trees. We first analyze the problems caused by the refinement of result granularity during XML keyword search and then propose to partition an XML document into XML fragments with the granularity of a Minimal Information Unit (MIU). Furthermore, we present efficient index structures and the corresponding search algorithms. Finally, our comprehensive experiments demonstrate the benefits of our method over previously proposed methods in terms of result quality, index size, and execution time.
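The general idea of fragment-level keyword search can be illustrated with a minimal sketch. The abstract does not say how MIUs are identified or how the paper's index is organized, so everything below is an assumption made for illustration: the `UNIT_TAGS` rule (treating elements with a given tag as fragment roots), the plain inverted index, the conjunctive matching, and the sample document are all hypothetical and are not the authors' method.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical rule: elements with these tags are treated as fragment roots.
# The abstract does not state how MIUs are actually determined.
UNIT_TAGS = {"paper"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def partition(root, unit_tags=UNIT_TAGS):
    """Return the subtrees rooted at unit-tag elements, in document order."""
    return [el for el in root.iter() if el.tag in unit_tags]


def build_index(fragments):
    """Map each word to the set of fragment ids whose text contains it."""
    index = defaultdict(set)
    for frag_id, frag in enumerate(fragments):
        for text in frag.itertext():
            for word in tokenize(text):
                index[word].add(frag_id)
    return index


def search(index, query):
    """Return ids of fragments that contain every keyword in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


# Hypothetical sample document, used only to exercise the sketch.
DOC = """
<proceedings>
  <paper><title>Keyword search in XML</title><author>Xu</author></paper>
  <paper><title>Twig pattern matching</title><author>Lu</author></paper>
</proceedings>
""".strip()

root = ET.fromstring(DOC)
fragments = partition(root)
index = build_index(fragments)

print(search(index, "xml keyword"))  # both words occur in the first fragment
print(search(index, "xu twig"))      # words occur only in different fragments
```

In this sketch a query matches only when all of its keywords fall inside the same fragment, so the second query returns nothing even though both words appear somewhere in the document. That is one simple way to keep results at a fixed granularity; the paper's own index structures and algorithms may differ substantially.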


Search Result, Irrelevant Information, Twig Pattern, Containment Edge, Total Search Time
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jianjun Xu (1)
  • Jiaheng Lu (2)
  • Wei Wang (1)
  • Baile Shi (1)
  1. Department of Computing and Information Technology, Fudan University, China
  2. Department of Computer Science, National University of Singapore, Singapore
