Prediction of Age, Sentiment, and Connectivity from Social Media Text

  • Conference paper
Web Information System Engineering – WISE 2011 (WISE 2011)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6997))

Abstract

Social media corpora, including the textual output of blogs, forums, and messaging applications, provide fertile ground for linguistic analysis: material diverse in topic and style, and at Web scale. We investigate manifest properties of textual messages in a large corpus of blog posts, including latent topics, psycholinguistic features, and author mood, to analyze the impact of age, emotion, and social connectivity. These properties are found to differ significantly across the examined cohorts, which suggests discriminative features for a number of useful classification tasks. We build high-performing binary classifiers for old versus young bloggers, social versus solo bloggers, and happy versus sad posts. Analysis of discriminative features shows that age turns upon choice of topic, whereas sentiment orientation is evidenced by linguistic style. Good prediction is achieved for social connectivity using topic and linguistic features, leaving tagged mood a modest role in all classifications.
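
As a rough illustration of the kind of binary text classification described above (for example, happy versus sad posts), the following is a minimal sketch only. It is not the authors' pipeline: the abstract names latent topics, psycholinguistic features, and tagged mood as features, whereas this sketch swaps in raw word counts and a multinomial Naive Bayes classifier. The class name `NaiveBayes`, the helper `tokenize`, and the four toy posts are all hypothetical and invented for illustration.

```python
import math
from collections import Counter

PUNCT = ".,!?;:\"'()"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(PUNCT) for w in text.lower().split())
    return [w for w in words if w]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing.

    Assumption: a stand-in classifier, not the method used in the paper.
    """

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def log_scores(self, text):
        """Return the unnormalised log posterior for each class."""
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            score = math.log(self.doc_counts[c] / total_docs)
            for w in tokenize(text):
                if w in self.vocab:  # words never seen in training are ignored
                    score += math.log((self.word_counts[c][w] + 1) / denom)
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Invented toy posts, for illustration only (not data from the paper).
TOY_POSTS = [
    ("had a wonderful day with friends", "happy"),
    ("so excited and grateful today", "happy"),
    ("feeling lonely and tired tonight", "sad"),
    ("i miss her and i feel awful", "sad"),
]

model = NaiveBayes().fit(
    [text for text, _ in TOY_POSTS],
    [label for _, label in TOY_POSTS],
)
```

Calling `model.predict("wonderful friends")` labels a new post with one of the two classes. The same structure would apply to the old versus young and social versus solo tasks once per-author labels are available; richer features such as topic proportions would replace the word counts.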

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nguyen, T., Phung, D., Adams, B., Venkatesh, S. (2011). Prediction of Age, Sentiment, and Connectivity from Social Media Text. In: Bouguettaya, A., Hauswirth, M., Liu, L. (eds) Web Information System Engineering – WISE 2011. WISE 2011. Lecture Notes in Computer Science, vol 6997. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24434-6_17

  • DOI: https://doi.org/10.1007/978-3-642-24434-6_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24433-9

  • Online ISBN: 978-3-642-24434-6

  • eBook Packages: Computer Science, Computer Science (R0)
