Discover Your Social Identity from What You Tweet: A Content Based Approach

Huang, Binxuan; Carley, Kathleen M.

doi:10.1007/978-3-030-42699-6_2

Part of the book series: Lecture Notes in Social Networks (LNSN)

Abstract

An identity denotes the role an individual or a group plays in highly differentiated contemporary societies. In this paper, our goal is to classify Twitter users based on their role identities. We first collect a coarse-grained public figure dataset automatically, then manually label a more fine-grained identity dataset. We propose a hierarchical self-attention neural network for Twitter user role identity classification. Our experiments demonstrate that the proposed model significantly outperforms multiple baselines. We further propose a transfer learning scheme that improves our model’s performance by a large margin. Such transfer learning also greatly reduces the need for a large amount of human-labeled data.
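
The abstract names a hierarchical model but gives no architectural detail, so the following is only a minimal sketch of the general idea of hierarchical pooling: word vectors are combined into tweet vectors, and tweet vectors into one user vector. It is not the authors' network. As a simplification it uses attention pooling against a single query vector in place of self-attention layers, and every name in it (attention_pool, encode_user, the toy vocabulary, the random embeddings and queries) is a hypothetical stand-in for parameters that a real model would learn.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Collapse a list of vectors into one vector.

    Each vector is scored by its scaled dot product with `query`;
    the softmax of the scores gives the mixing weights.
    """
    dim = len(query)
    scores = [
        sum(v_i * q_i for v_i, q_i in zip(v, query)) / math.sqrt(dim)
        for v in vectors
    ]
    weights = softmax(scores)
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights


def encode_user(tweets, embed, word_query, tweet_query):
    """Two-level pooling: words -> tweet vectors -> one user vector."""
    tweet_vectors = []
    for tweet in tweets:
        word_vectors = [embed[w] for w in tweet.lower().split() if w in embed]
        if word_vectors:  # skip tweets that contain no known words
            tweet_vectors.append(attention_pool(word_vectors, word_query)[0])
    if not tweet_vectors:
        raise ValueError("no tweet contained a known word")
    return attention_pool(tweet_vectors, tweet_query)


# Toy setup: random values stand in for learned embeddings and queries.
random.seed(0)
DIM = 4
vocab = ["breaking", "news", "tonight", "vote", "for", "me", "match", "day"]
embed = {w: [random.uniform(-1, 1) for _ in range(DIM)] for w in vocab}
word_query = [random.uniform(-1, 1) for _ in range(DIM)]
tweet_query = [random.uniform(-1, 1) for _ in range(DIM)]

user_vector, tweet_weights = encode_user(
    ["Breaking news tonight", "Match day"], embed, word_query, tweet_query
)
```

In a classifier built along these lines, user_vector would feed a final softmax layer over the identity classes, and tweet_weights indicates how much each tweet contributed to the user representation.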

Author information

Authors and Affiliations

Carnegie Mellon University, Pittsburgh, PA, USA
Binxuan Huang & Kathleen M. Carley

Corresponding author

Correspondence to Binxuan Huang.

Editor information

Editors and Affiliations

Computer Science & Engineering, Arizona State University, Tempe, AZ, USA
Kai Shu
College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA, USA
Suhang Wang
College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA, USA
Dongwon Lee
Computer Science & Engineering, Arizona State University, Tempe, AZ, USA
Huan Liu

About this chapter

Cite this chapter

Huang, B., Carley, K.M. (2020). Discover Your Social Identity from What You Tweet: A Content Based Approach. In: Shu, K., Wang, S., Lee, D., Liu, H. (eds) Disinformation, Misinformation, and Fake News in Social Media. Lecture Notes in Social Networks. Springer, Cham. https://doi.org/10.1007/978-3-030-42699-6_2

DOI: https://doi.org/10.1007/978-3-030-42699-6_2
Published: 18 June 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-42698-9
Online ISBN: 978-3-030-42699-6
eBook Packages: Computer Science, Computer Science (R0)
