Stable Coordinate Pairs in Spanish: Statistical and Structural Description

  • Igor A. Bolshakov
  • Sofia N. Galicia-Haro
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3773)

Abstract

Stable coordinate pairs (SCPs) like comentarios y sugerencias ‘comments and suggestions’ or sano y salvo ‘safe and sound’ are rather frequent in Spanish texts, though there are only a few thousand of them in the language. We characterize SCPs statistically by a numerical Stable Connection Index and reveal its unimodal distribution. We also propose lexical, morphologic, syntactic, and semantic categories for the structural description of SCPs, covering both a whole SCP and its components. We argue that a database containing a set of categorized SCPs facilitates several tasks of automatic NLP. The research is based on a set of ca. 2200 Spanish coordinate pairs.

Keywords

Structural Description; Content Word; Word Sense Disambiguation; Coordinate Pair; Learn Foreign Language
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Igor A. Bolshakov (1)
  • Sofia N. Galicia-Haro (2)
  1. Center for Computing Research (CIC), National Polytechnic Institute (IPN), Mexico City, Mexico
  2. Faculty of Sciences, National Autonomous University of Mexico (UNAM), Mexico City, Mexico
