Person, Organization, or Personage: Towards User Account Type Prediction in Microblogs

  • Ivan Samborskii
  • Andrey Filchenkov (email author)
  • Georgiy Korneev
  • Alex Farseev
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 947)


During the past decade, microblog services have been used extensively by millions of business and private users as one of the most powerful information broadcasting tools. Twitter, for example, has attracted many social science researchers due to its high popularity, its constrained format of thought expression, and its ability to react to current trends. However, unstructured data from microblogs often suffer from a lack of representativeness due to a tremendous amount of noise. Such noise is often introduced by the activity of organizational and fake user accounts, which may not be useful in many application domains. To tackle this information filtering problem, in this paper we classify Twitter accounts into three categories: “Personal”, “Organization”, and “Personage”. Specifically, we use various text-based data representation approaches to extract features for our proposed microblog account type prediction framework, “POP-MAP”. To study the problem at a cross-language level, we harvested and learned from a multilingual Twitter dataset, which allows us to achieve better classification performance than various state-of-the-art baselines.


Twitter, Social media, Profile learning, Natural language processing, Account type classification



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Ivan Samborskii (1, 2)
  • Andrey Filchenkov (1), email author
  • Georgiy Korneev (1)
  • Alex Farseev (1, 3)

  1. ITMO University, St. Petersburg, Russia
  2. National University of Singapore, Singapore, Singapore
  3. SoMin Research, Singapore, Singapore
