Constructing a Generic Natural Language Interface for an XML Database

  • Yunyao Li
  • Huahai Yang
  • H. V. Jagadish
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3896)


We describe the construction of a generic natural language query interface to an XML database. Our interface accepts an arbitrary English sentence as a query; the query can be quite complex and can include aggregation, nesting, and value joins, among other things. The query is translated, potentially after reformulation, into an XQuery expression. The translation maps the grammatical proximity of parsed natural language tokens in the parse tree of the query sentence to the proximity of the corresponding elements in the XML data to be retrieved. Our experimental assessment, through a user study, demonstrates that this type of natural language interface is good enough to be usable now, with no restrictions on the application domain.
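The proximity idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it only assumes that two query tokens that are grammatically related (here the hypothetical element names `title` and `director`) should be matched to the XML elements that sit closest together, and it measures closeness by the depth of their lowest common ancestor. The sample document and the function names `paths`, `lca_depth`, and `closest_pairs` are invented for illustration.

```python
import xml.etree.ElementTree as ET
from itertools import product

# Hypothetical sample data, not taken from the paper.
XML = """
<movies>
  <movie><title>Traffic</title><director>Steven Soderbergh</director></movie>
  <movie><title>Gladiator</title><director>Ridley Scott</director></movie>
</movies>
"""

def paths(root):
    """Map each element to its list of nodes from the root down to itself."""
    out = {}

    def walk(node, trail):
        trail = trail + [node]
        out[node] = trail
        for child in node:
            walk(child, trail)

    walk(root, [])
    return out

def lca_depth(path_a, path_b):
    """Number of shared leading nodes, i.e. depth of the lowest common ancestor."""
    depth = 0
    for a, b in zip(path_a, path_b):
        if a is not b:
            break
        depth += 1
    return depth

def closest_pairs(root, tag_a, tag_b):
    """Return text pairs of tag_a/tag_b elements whose common ancestor is deepest."""
    p = paths(root)
    elems_a = [e for e in p if e.tag == tag_a]
    elems_b = [e for e in p if e.tag == tag_b]
    scored = [(lca_depth(p[a], p[b]), a, b) for a, b in product(elems_a, elems_b)]
    if not scored:
        return []
    best = max(d for d, _, _ in scored)
    return [(a.text, b.text) for d, a, b in scored if d == best]

root = ET.fromstring(XML)
pairs = closest_pairs(root, "title", "director")
```

Under these assumptions, each title is paired only with the director inside the same `movie` element, because that pair shares a deeper common ancestor than a pair drawn from two different movies. A real translator would emit an XQuery expression that encodes this relationship rather than computing the pairs directly.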


Keywords: Natural Language; Search Task; Parse Tree; Aggregate Function; Core Token





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Yunyao Li (1)
  • Huahai Yang (2)
  • H. V. Jagadish (1)
  1. University of Michigan, Ann Arbor, USA
  2. University at Albany, SUNY, Albany, USA
