Is This a Joke? Detecting Humor in Spanish Tweets

  • Santiago Castro
  • Matías Cubero
  • Diego Garat
  • Guillermo Moncecchi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10022)


While humor has historically been studied from psychological, cognitive and linguistic standpoints, its study from a computational perspective remains largely unexplored in Computational Linguistics. Some previous works exist, but a characterization of humor that allows its automatic recognition and generation is far from being specified. In this work we build a crowdsourced corpus of labeled tweets, annotated according to their humor value, letting the annotators subjectively decide which are humorous. A humor classifier for Spanish tweets is assembled based on supervised learning, reaching a precision of 84% and a recall of 69%.


Humor; Computational humor; Humor recognition; Machine learning; Natural language processing



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Santiago Castro (1)
  • Matías Cubero (1)
  • Diego Garat (1)
  • Guillermo Moncecchi (1)
  1. Universidad de la República, Montevideo, Uruguay
