Characteristics of Most Frequent Spanish Verb-Noun Combinations

  • Olga Kolesnikova
  • Alexander Gelbukh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10061)

Abstract

We study the most frequent Spanish verb-noun combinations retrieved from the Spanish Web Corpus. We present statistics for these combinations and analyze the degree of cohesiveness of their components. For the verb-noun combinations that turned out to be collocations, we determine their semantics in the form of lexical functions. We also observe which word senses are most typical for polysemous words in the combinations under study, and we determine the level of generalization that characterizes the semantics of these words, that is, the level of the hyperonymy-hyponymy tree at which they are located. The collected data can be used in various natural language processing applications, especially in predictive models that take the most frequent cases into account.

Keywords

Verb-noun combinations, Frequency, Collocations, Lexical functions, Hyperonymy

Notes

Acknowledgements

The authors are grateful to Vojtěch Kovář for providing the list of the most frequent verb-noun pairs from the Spanish Web Corpus of the Sketch Engine, www.sketchengine.co.uk. The authors also appreciate the support of the Mexican Government, which made it possible to complete this work: SNI-CONACYT, BEIFI-IPN, SIP-IPN: grants 20162064, 20161958, and 20162204, and the EDI Program. Special thanks go to Dr. Noé Alejandro Castro-Sánchez for collecting the statistics of verb senses in the Diccionario de la lengua española (DRAE).

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. ESCOM, Instituto Politécnico Nacional, Mexico City, Mexico
  2. CIC, Instituto Politécnico Nacional, Mexico City, Mexico
