Abstract
This paper addresses verified account prediction on Sina Weibo: using a single weibo from a user to predict whether that user's account is verified. To the best of our knowledge, verified account prediction on Sina Weibo has not been studied before. To better understand the prediction problem, a comprehensive data analysis of weibos related to verified accounts is conducted first. Verified account prediction is then formulated as a sequence learning problem. Specifically, a weibo from a user is represented as a sequence of feature values obtained by feature hashing, and the label to predict is whether the user's account is verified. A deep learning approach is proposed for solving verified account prediction in this formulation. In the experiments, the proposed approach outperforms the shallow learning methods it is compared against by large margins in both accuracy and F1.
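To make the formulation concrete, the following minimal sketch shows one way a weibo could be turned into a fixed-length sequence of hashed feature values. It is an illustration only, not the authors' implementation: the function names `hash_token` and `weibo_to_sequence`, the MD5-based hash, the bucket count, the maximum length, the reserved padding index 0, and the assumption that the weibo has already been segmented into tokens are all hypothetical choices made for this example.

```python
import hashlib


def hash_token(token: str, n_buckets: int) -> int:
    """Map a token to a stable index in [1, n_buckets - 1].

    Index 0 is reserved for padding (an assumption of this sketch).
    MD5 is used instead of Python's built-in hash() so that the
    mapping is identical across interpreter runs.
    """
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % (n_buckets - 1) + 1


def weibo_to_sequence(tokens, n_buckets=2**18, max_len=50):
    """Convert a tokenized weibo into a fixed-length sequence of hashed indices.

    Sequences longer than max_len are truncated; shorter ones are
    right-padded with 0.
    """
    ids = [hash_token(t, n_buckets) for t in tokens[:max_len]]
    return ids + [0] * (max_len - len(ids))


# Hypothetical example: a weibo already split into four tokens.
tokens = ["今天", "天气", "很", "好"]
seq = weibo_to_sequence(tokens, n_buckets=1000, max_len=6)
label = 1  # 1 = verified account, 0 = unverified (illustrative label)
```

Each weibo then yields a pair `(seq, label)`, which is the kind of input a sequence model expects. Hashing avoids storing an explicit vocabulary, at the cost of occasional collisions between distinct tokens.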
Data availability
Enquiries about data availability should be directed to the authors.
Notes
Twitter’s and Sina Weibo’s verified account programs let people know that an account is authentic. People and companies can ask to have an account verified by submitting a request with supporting evidence, such as a verified phone number or a confirmed email address. Common verified accounts include news agencies, organizations and public figures.
Acknowledgements
We would like to thank the reviewers for their valuable comments and suggestions.
Funding
No funding was received for this research.
Author information
Contributions
The authors contributed to the study conception and design, material preparation, data collection and analysis, and manuscript preparation. The authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
All authors of this paper declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
A sample implementation of the proposed methods in Keras is shown below.

Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Monica Liu, S., Chen, JH. Verified account prediction on Sina Weibo with deep learning. Soft Comput 27, 3941–3954 (2023). https://doi.org/10.1007/s00500-022-07528-4
DOI: https://doi.org/10.1007/s00500-022-07528-4
Keywords
- Verified account prediction
- Sina Weibo
- Deep learning