Abstract
Twitter, one of the most popular social media platforms, has been studied from many angles. One important source of information on Twitter is users' biographies: short, free-form self-introductions that often describe a user's background and interests. However, to the best of our knowledge, little work has tried to extract information from Twitter biographies. In this work, we study how to extract information that reveals users' personal interests from these biographies. We train a sequential labeling model on automatically constructed labeled data, and we extract and analyze the popular patterns that express user interests. We also study the connection between interest tags extracted from user biographies and tweet content, and find only a weak linkage between them, suggesting that biographies can potentially serve as a complementary source of information to tweets.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Ding, Y., Jiang, J. (2014). Extracting Interest Tags from Twitter User Biographies. In: Jaafar, A., et al. Information Retrieval Technology. AIRS 2014. Lecture Notes in Computer Science, vol 8870. Springer, Cham. https://doi.org/10.1007/978-3-319-12844-3_23
DOI: https://doi.org/10.1007/978-3-319-12844-3_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12843-6
Online ISBN: 978-3-319-12844-3
eBook Packages: Computer Science, Computer Science (R0)