Multi-aspect Entity-Centric Analysis of Big Social Media Archives

  • Pavlos Fafalios
  • Vasileios Iosifidis
  • Kostas Stefanidis
  • Eirini Ntoutsi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10450)


Social media archives serve as important historical information sources, and thus meaningful analysis and exploration methods are of immense value for historians, sociologists and other interested parties. In this paper, we propose an entity-centric approach to analyzing social media archives, and we define measures for studying how entities are reflected in social media in different time periods and under different aspects (such as popularity, attitude, controversiality, and connectedness with other entities). A case study using a large Twitter archive spanning 4 years illustrates the insights that can be gained from such an entity-centric, multi-aspect analysis.
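
To make the idea of per-period, per-entity measures more concrete, the following is a minimal sketch of what two such measures might look like. It is not the paper's definition: the `Tweet` record, the pre-computed entity annotations, the integer sentiment score, and both formulas (popularity as the share of tweets in a period that mention the entity, attitude as the mean sentiment of those tweets) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Tweet:
    # Hypothetical record: entity links and sentiment are assumed to be
    # produced by earlier annotation steps that are not shown here.
    day: date
    entities: frozenset
    sentiment: int  # assumed convention: > 0 positive, < 0 negative, 0 neutral


def in_period(tweets, start, end):
    """Return the tweets posted between start and end (both inclusive)."""
    return [t for t in tweets if start <= t.day <= end]


def popularity(tweets, entity, start, end):
    """Assumed measure: fraction of the period's tweets mentioning the entity."""
    period = in_period(tweets, start, end)
    if not period:
        return 0.0
    return sum(1 for t in period if entity in t.entities) / len(period)


def attitude(tweets, entity, start, end):
    """Assumed measure: mean sentiment of the period's tweets mentioning the entity."""
    mentions = [t for t in in_period(tweets, start, end) if entity in t.entities]
    if not mentions:
        return 0.0
    return sum(t.sentiment for t in mentions) / len(mentions)
```

Under these assumptions, comparing `popularity` or `attitude` for the same entity across consecutive periods gives a simple view of how its reflection in the archive changes over time.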



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Pavlos Fafalios (1)
  • Vasileios Iosifidis (1)
  • Kostas Stefanidis (2)
  • Eirini Ntoutsi (1)

  1. L3S Research Center, University of Hannover, Hanover, Germany
  2. Faculty of Natural Sciences, University of Tampere, Tampere, Finland
