Prepositions and Conjunctions in a Natural Language Interfaces to Databases

  • J. Javier González B.
  • Rodolfo A. Pazos R.
  • Alexander Gelbukh
  • Grigori Sidorov
  • Hector Fraire H.
  • I. Cristina Cruz C.
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4743)

Abstract

This paper presents a treatment of prepositions and conjunctions in natural language interfaces to databases (NLIDBs) that allows better translation of natural language queries into formal languages. Prepositions and conjunctions have not been sufficiently studied for use in NLIDBs, because most NLIDBs just look for keywords in the sentence and focus their analysis on nouns and verbs, discarding the auxiliary words in the query. This paper shows that prepositions and conjunctions can be represented as operations using formal set theory. Additionally, since prepositions and conjunctions keep their meaning in any context, their treatment is domain independent. Our experiments used the Spanish language. We validated our approach using two SQL Server databases, Northwind and Pubs, with a corpus of 198 different queries for the first and 70 queries for the second. For Northwind, 84% of the queries were translated correctly; for Pubs, 80%.
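As an illustration of the set-theoretic idea only, the following minimal sketch treats each query condition as the set of row ids it selects and maps a conjunction word to a set operation. The table contents, the helper names (`select`, `combine`), and the mapping of Spanish "y" to intersection and "o" to union are assumptions made for this example; the abstract does not specify the operations the authors actually assign to each word.

```python
# Toy table: row id -> record (hypothetical data, not taken from the paper's corpus).
PRODUCTS = {
    1: {"name": "Chai", "category": "Beverages", "price": 18.0},
    2: {"name": "Chang", "category": "Beverages", "price": 19.0},
    3: {"name": "Tofu", "category": "Produce", "price": 23.25},
    4: {"name": "Ikura", "category": "Seafood", "price": 31.0},
}


def select(predicate):
    """Return the set of row ids whose record satisfies the predicate."""
    return {row_id for row_id, row in PRODUCTS.items() if predicate(row)}


# Assumed mapping from conjunction words to set operations (illustrative only).
CONJUNCTION_OPS = {
    "y": set.intersection,  # "and": rows satisfying both conditions
    "o": set.union,         # "or": rows satisfying either condition
}


def combine(left, conjunction, right):
    """Combine two condition sets with the operation assigned to the conjunction."""
    return CONJUNCTION_OPS[conjunction](left, right)


beverages = select(lambda row: row["category"] == "Beverages")
over_18_50 = select(lambda row: row["price"] > 18.5)

both = combine(beverages, "y", over_18_50)
either = combine(beverages, "o", over_18_50)

print(sorted(both))    # [2]
print(sorted(either))  # [1, 2, 3, 4]
```

Because the mapping is keyed on the word rather than on any table or column name, the same lookup would apply unchanged to a different schema, which is one way to read the abstract's claim of domain independence.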

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • J. Javier González B. (1)
  • Rodolfo A. Pazos R. (2)
  • Alexander Gelbukh (3)
  • Grigori Sidorov (3)
  • Hector Fraire H. (1)
  • I. Cristina Cruz C. (1)

  1. Instituto Tecnológico de Ciudad Madero, México
  2. Centro Nacional de Investigación y Desarrollo Tecnológico (CENIDET), México
  3. Centro de Investigación en Computación
