Get Your Jokes Right: Ask the Crowd

  • Joana Costa
  • Catarina Silva
  • Mário Antunes
  • Bernardete Ribeiro
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6918)


Joke classification is an intrinsically subjective and complex task, mainly because of the difficulty of coping with contextual constraints when classifying each joke. Nowadays people have less time to devote to searching for and enjoying humour and, as a consequence, are usually interested in a filtered set of jokes worth reading, that is, jokes with a high probability of making them laugh.

In this paper we propose a crowdsourcing-based collective intelligence mechanism to classify humour and to recommend the most interesting jokes for further reading. Crowdsourcing is becoming a model for problem solving, as it revolves around using groups of people to handle tasks traditionally associated with experts or machines.

We put forward an active learning Support Vector Machine (SVM) approach that uses crowdsourcing to improve the classification of users' custom preferences. Experiments were carried out using the widely available Jester jokes dataset, with encouraging results.
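The combination of active learning, an SVM, and crowd labels can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the query strategy, the label aggregation rule, or the features. The sketch assumes a linear SVM trained with a Pegasos-style subgradient method, a smallest-margin query rule, majority voting over crowd answers, and hypothetical 2-D feature vectors in place of real joke text. All function names and data below are illustrative.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, epochs=50, lam=0.01, seed=0):
    """Pegasos-style subgradient descent on hinge loss + L2 (no bias term)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)              # decaying step size
            violated = y[i] * dot(w, X[i]) < 1  # margin check before the update
            w = [(1.0 - 1.0 / t) * wj for wj in w]  # shrink: 1 - eta * lam
            if violated:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict(w, x):
    return 1 if dot(w, x) >= 0 else -1

def majority_vote(votes):
    """Aggregate crowd votes (+1 funny, -1 not funny); ties count as -1."""
    return 1 if sum(votes) > 0 else -1

def most_uncertain(w, X, pool):
    """Pick the unlabeled example closest to the current hyperplane."""
    return min(pool, key=lambda i: abs(dot(w, X[i])))

def active_learn(X, crowd_votes, seed_ids, rounds):
    """Start from a few crowd-labeled jokes, then query the crowd each round."""
    labeled = {i: majority_vote(crowd_votes[i]) for i in seed_ids}
    pool = [i for i in range(len(X)) if i not in labeled]

    def fit():
        ids = sorted(labeled)
        return train_linear_svm([X[i] for i in ids], [labeled[i] for i in ids])

    for _ in range(rounds):
        if not pool:
            break
        w = fit()
        q = most_uncertain(w, X, pool)          # ask the crowd about this joke
        labeled[q] = majority_vote(crowd_votes[q])
        pool.remove(q)
    return fit(), labeled

# Hypothetical toy data: one 2-D feature vector and three crowd votes per joke.
X = [[2.0, 1.5], [1.0, 2.5], [0.3, 0.2],
     [-2.0, -1.0], [-1.5, -2.5], [-0.2, -0.4]]
crowd_votes = [[1, 1, -1], [1, 1, 1], [1, -1, 1],
               [-1, -1, 1], [-1, -1, -1], [-1, 1, -1]]

w, labeled = active_learn(X, crowd_votes, seed_ids=[0, 3], rounds=2)
print(sorted(labeled), [predict(w, x) for x in X])
```

In this toy setting the two queries go to the jokes nearest the decision boundary, which is where an extra crowd label is expected to be most informative. A real system would replace the toy vectors with text features and could weight annotators by reliability instead of using a plain majority.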


Keywords: Crowdsourcing; Support Vector Machines; Text Classification; Humour Classification





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Joana Costa (1)
  • Catarina Silva (1, 2)
  • Mário Antunes (1, 3)
  • Bernardete Ribeiro (2)
  1. Computer Science Communication and Research Centre, School of Technology and Management, Polytechnic Institute of Leiria, Portugal
  2. Department of Informatics Engineering, Center for Informatics and Systems of the University of Coimbra (CISUC), Portugal
  3. Center for Research in Advanced Computing Systems (CRACS), Portugal
