DILUCT: An Open-Source Spanish Dependency Parser Based on Rules, Heuristics, and Selectional Preferences

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3999)

Abstract

A method for recognizing syntactic patterns in Spanish is presented. The method is based on dependency parsing: heuristic rules infer dependency relationships between words, and word co-occurrence statistics (learnt in an unsupervised manner) resolve ambiguities such as prepositional phrase attachment. If a complete parse cannot be produced, a partial structure is built in which some (if not all) dependency relations are identified. Evaluation shows that, in spite of its simplicity, the parser’s accuracy is superior to that of the existing parsers available for Spanish. Though certain grammar rules, as well as the lexical resources used, are specific to Spanish, the suggested approach is language-independent.
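
The use of co-occurrence statistics for prepositional phrase attachment can be illustrated with a minimal sketch. It is not the DILUCT implementation: the counts are invented toy values, the add-one smoothing is an assumption, and the names `TRIPLE_COUNTS`, `HEAD_COUNTS`, `attachment_score`, and `attach_pp` are hypothetical. The sketch only shows the general idea of attaching a prepositional phrase to whichever candidate head co-occurs with it more often.

```python
from collections import Counter

# Hypothetical toy counts of (head, preposition, dependent) triples,
# standing in for statistics gathered from an unannotated corpus.
TRIPLE_COUNTS = Counter({
    ("comer", "con", "cuchara"): 8,
    ("sopa", "con", "cuchara"): 1,
    ("comer", "con", "fideos"): 1,
    ("sopa", "con", "fideos"): 6,
})

# Hypothetical total occurrence counts of each candidate head.
HEAD_COUNTS = Counter({"comer": 40, "sopa": 20})


def attachment_score(head, prep, dep):
    """Smoothed relative frequency of the triple given the head (assumed formula)."""
    return (TRIPLE_COUNTS[(head, prep, dep)] + 1) / (HEAD_COUNTS[head] + 1)


def attach_pp(verb, noun, prep, dep):
    """Return the candidate head (verb or noun) with the higher score."""
    verb_score = attachment_score(verb, prep, dep)
    noun_score = attachment_score(noun, prep, dep)
    return verb if verb_score >= noun_score else noun


# "comer sopa con cuchara": the instrument attaches to the verb.
print(attach_pp("comer", "sopa", "con", "cuchara"))
# "comer sopa con fideos": the ingredient attaches to the noun.
print(attach_pp("comer", "sopa", "con", "fideos"))
```

With these toy counts the first phrase attaches to the verb and the second to the noun; real counts and a real scoring formula would have to come from the corpus and method described in the paper itself.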

This work was done with partial support from the Mexican Government (SNI, CGPI-IPN, COFAA-IPN, and PIFI-IPN). The authors cordially thank Jordi Atserias for providing the data for the comparison of the TACAT parser with our system.


Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Calvo, H., Gelbukh, A. (2006). DILUCT: An Open-Source Spanish Dependency Parser Based on Rules, Heuristics, and Selectional Preferences. In: Kop, C., Fliedl, G., Mayr, H.C., Métais, E. (eds) Natural Language Processing and Information Systems. NLDB 2006. Lecture Notes in Computer Science, vol 3999. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11765448_15

  • DOI: https://doi.org/10.1007/11765448_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-34616-6

  • Online ISBN: 978-3-540-34617-3

  • eBook Packages: Computer Science (R0)
