Investigating the Role of Emotion-Based Features in Author Gender Classification of Text

  • Calkin Suero Montero
  • Myriam Munezero
  • Tuomo Kakkonen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8404)


Research has shown that writing style is influenced by a wide array of factors, including text genre and the author’s gender. Going beyond the analysis of linguistic features such as n-grams, stylometric variables and word categories, this paper presents an exploratory study of the role that emotions expressed in writing play in discriminating author gender across different text genres. In this work, gender classification is treated as a binary classification problem in which the discriminating features are drawn from a vector space that includes emotion-based features. Results show that by exploiting the emotional information present in personal journal (diary) texts, up to 80% cross-validation accuracy can be reached with a support vector machine (SVM) classifier. Over 75% cross-validation accuracy is reached when classifying the author gender of blog texts. Our findings suggest that emotion-based features have a positive effect on author gender classification.
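As a rough illustration of the pipeline the abstract describes (emotion-based feature vectors fed to a binary classifier and scored by cross-validation), the following minimal sketch may help. Everything in it is an assumption made for illustration: the three-category `LEXICON`, the placeholder labels `class_a` and `class_b`, and the six toy sentences are invented and do not come from the paper. The paper reports results with an SVM; to stay dependency-free, this sketch swaps in a nearest-centroid classifier instead.

```python
from collections import Counter

# Hypothetical toy lexicon (word -> emotion category); not the paper's resource.
LEXICON = {
    "happy": "joy", "love": "joy", "glad": "joy",
    "angry": "anger", "hate": "anger", "furious": "anger",
    "sad": "sadness", "cry": "sadness", "lonely": "sadness",
}
EMOTIONS = ["joy", "anger", "sadness"]


def emotion_vector(text):
    """Return, per emotion category, the share of tokens found in the lexicon."""
    tokens = text.lower().split()
    counts = Counter(LEXICON[t] for t in tokens if t in LEXICON)
    total = len(tokens) or 1  # avoid division by zero on empty text
    return [counts[e] / total for e in EMOTIONS]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def predict(vec, centroids):
    """Nearest-centroid stand-in for the SVM; ties go to the smaller label."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(vec, centroids[label]))
    return min(sorted(centroids), key=dist)


def cross_val_accuracy(texts, labels, k):
    """k-fold cross-validation accuracy; item i is assigned to fold i % k."""
    X = [emotion_vector(t) for t in texts]
    correct = 0
    for fold in range(k):
        test_idx = [i for i in range(len(X)) if i % k == fold]
        train_idx = [i for i in range(len(X)) if i % k != fold]
        centroids = {
            lab: centroid([X[i] for i in train_idx if labels[i] == lab])
            for lab in {labels[i] for i in train_idx}
        }
        correct += sum(predict(X[i], centroids) == labels[i] for i in test_idx)
    return correct / len(X)


# Invented toy data with placeholder class labels.
TEXTS = [
    "i am happy and glad today",
    "i hate this and i am angry",
    "we love the happy news",
    "he was furious and angry",
    "so glad i love it",
    "i hate it i am furious",
]
LABELS = ["class_a", "class_b", "class_a", "class_b", "class_a", "class_b"]

accuracy = cross_val_accuracy(TEXTS, LABELS, k=3)
```

The toy classes are trivially separable, so the resulting accuracy says nothing about real data. In practice the centroid step would be replaced by an SVM implementation and the lexicon by an established affective resource.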


Text analysis, sentiment analysis, gender classification, emotion-based features





Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Calkin Suero Montero (1)
  • Myriam Munezero (1)
  • Tuomo Kakkonen (1)
  1. School of Computing, University of Eastern Finland, Finland
