Detecting Social Capitalists on Twitter Using Similarity Measures

  • Nicolas Dugué
  • Anthony Perez
Part of the Studies in Computational Intelligence book series (SCI, volume 476)


Social networks such as Twitter or Facebook are part of the phenomenon called Big Data, a term used to describe very large and complex data sets. The connections between users of these networks are naturally represented as (directed) graphs. In this paper, we focus on two aspects of social network analysis. First, our goal is to find an efficient and high-level way to store and process a social network graph using reasonable computing resources (processor and memory). We believe that this is an important research interest, since it provides a more democratic method to deal with large graphs. Next, we turn our attention to the study of social capitalists, a specific kind of user on Twitter. Roughly speaking, such users try to gain visibility by following other users regardless of the content those users publish. Using two similarity measures called overlap index and ratio, we show that such users may be detected and classified very efficiently.
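As a rough illustration of the two measures named above, the following Python sketch computes them for a single user from the set of accounts following that user and the set of accounts the user follows. The definitions used here are assumptions, since the abstract does not state them: the overlap index is taken to be the size of the intersection divided by the size of the smaller set, and the ratio is taken to be the number of followees divided by the number of followers. The function names and the example account names are hypothetical.

```python
from typing import Set


def overlap_index(followers: Set[str], followees: Set[str]) -> float:
    """Assumed definition: |A ∩ B| / min(|A|, |B|).

    Returns 0.0 when either set is empty, to avoid dividing by zero.
    """
    smaller = min(len(followers), len(followees))
    if smaller == 0:
        return 0.0
    return len(followers & followees) / smaller


def ratio(followers: Set[str], followees: Set[str]) -> float:
    """Assumed definition: number of followees / number of followers.

    Returns infinity for a user who follows others but has no followers,
    and 0.0 for a user with neither followers nor followees.
    """
    if not followers:
        return float("inf") if followees else 0.0
    return len(followees) / len(followers)


if __name__ == "__main__":
    # Hypothetical user: three of the four followees follow back.
    followers = {"alice", "bob", "carol", "dave"}
    followees = {"alice", "bob", "carol", "erin"}
    print(overlap_index(followers, followees))
    print(ratio(followers, followees))
```

Under these assumed definitions, a user whose followers and followees largely coincide has an overlap index close to 1, which is the kind of follow-back behaviour the abstract attributes to social capitalists. Any thresholds for classification would have to come from the full paper.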


Social Capitalist, Similarity Measure, Social Network Analysis, Community Detection, Large Graph
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. LIFO, Université d’Orléans, Orléans, France
