A Domain Independent Natural Language Interface to Databases Capable of Processing Complex Queries

  • Rodolfo A. Pazos Rangel
  • O. Joaquín Pérez
  • B. Juan Javier González
  • Alexander Gelbukh
  • Grigori Sidorov
  • M. Myriam J. Rodríguez
Conference paper

DOI: 10.1007/11579427_85

Part of the Lecture Notes in Computer Science book series (LNCS, volume 3789)
Cite this paper as:
Rangel R.A.P., Joaquín Pérez O., Juan Javier González B., Gelbukh A., Sidorov G., Rodríguez M.M.J. (2005) A Domain Independent Natural Language Interface to Databases Capable of Processing Complex Queries. In: Gelbukh A., de Albornoz Á., Terashima-Marín H. (eds) MICAI 2005: Advances in Artificial Intelligence. MICAI 2005. Lecture Notes in Computer Science, vol 3789. Springer, Berlin, Heidelberg

Abstract

We present a method for creating natural language interfaces to databases (NLIDBs) that translate natural language queries into SQL. The method is domain independent, i.e., it avoids the tedious process of configuring the NLIDB for a given domain. We automatically generate the domain dictionary for query translation from the semantic metadata of the database. Our semantic representation of a query is a graph that includes information from the database metadata. The query is translated taking into account the parts of speech of its words (obtained through linguistic processing). Specifically, unlike most existing NLIDBs, we treat auxiliary words (prepositions and conjunctions) as set theory operators, which allows for processing more complex queries. Experimental results (obtained on two Spanish databases from different domains) show that the treatment of auxiliary words improves the correctness of translation by 12.1%. With the developed NLIDB, 82% of queries were correctly translated (and thus answered). Reconfiguring the NLIDB from one domain to the other took only ten minutes.
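As a rough illustration of what treating conjunctions as set theory operators can mean, the following Python sketch combines the result sets of simple conditions with set operations. It is not the method of the paper, whose details the abstract does not give: the table `FLIGHTS`, the mapping `SET_OPERATORS`, the English connectives, and the functions `select_ids` and `evaluate` are all hypothetical names chosen for this example.

```python
# Toy in-memory table; rows and column names are invented for illustration.
FLIGHTS = [
    {"id": 1, "origin": "Mexico City", "destination": "Madrid"},
    {"id": 2, "origin": "Mexico City", "destination": "Paris"},
    {"id": 3, "origin": "Monterrey", "destination": "Madrid"},
]

# Hypothetical mapping from auxiliary words to set operations.
SET_OPERATORS = {
    "and": set.intersection,
    "or": set.union,
    "except": set.difference,
}


def select_ids(column, value):
    """Return the ids of the rows whose `column` equals `value`."""
    return {row["id"] for row in FLIGHTS if row[column] == value}


def evaluate(conditions, connectives):
    """Combine the result sets of the conditions, left to right.

    `conditions` is a list of (column, value) pairs; `connectives` holds
    the word that joins each condition to the one before it.
    """
    result = select_ids(*conditions[0])
    for word, condition in zip(connectives, conditions[1:]):
        result = SET_OPERATORS[word](result, select_ids(*condition))
    return result


query = [("origin", "Mexico City"), ("destination", "Madrid")]
print(evaluate(query, ["and"]))     # flights meeting both conditions
print(evaluate(query, ["or"]))      # flights meeting either condition
print(evaluate(query, ["except"]))  # first condition but not the second
```

In a real NLIDB the same operators would more plausibly be emitted as SQL (`AND`, `OR`, `INTERSECT`, `UNION`, `EXCEPT`) than evaluated in memory; the sketch only shows the correspondence between connectives and set operations.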

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Rodolfo A. Pazos Rangel (1)
  • O. Joaquín Pérez (1)
  • B. Juan Javier González (2)
  • Alexander Gelbukh (3)
  • Grigori Sidorov (3)
  • M. Myriam J. Rodríguez (2)
  1. National Center for Investigation and Technological Development (CENIDET)
  2. Technological Institute of Cd. Madero, México
  3. Center for Computing Research (CIC), National Polytechnic Institute (IPN), México
