Knowledge and Information Systems, Volume 33, Issue 3, pp 523–547

Analyzing collective behavior from blogs using swarm intelligence

  • Soumya Banerjee
  • Nitin Agarwal
Regular Paper


With the rapid growth in the availability and popularity of interpersonal and behavior-rich resources such as blogs and other social media avenues, new opportunities and challenges arise as people now can, and do, actively use computational intelligence to seek out and understand the opinions of others. The study of the collective behavior of individuals has implications for business intelligence, predictive analytics, customer relationship management, and the examination of online collective action as manifested by various flash mobs, the Arab Spring (2011), and other such events. In this article, we introduce a nature-inspired theory to model collective behavior from the observed data on blogs using swarm intelligence, where the goal is to accurately model and predict the future behavior of a large population after observing their interactions during a training phase. Specifically, an ant colony optimization model is trained with behavioral trends from the blog data and is tested on real-world blogs. Promising results were obtained in trend prediction using an ant colony based pheromone classifier and the CHI statistical measure. We provide empirical guidelines for selecting suitable parameters for the model, conclude with interesting observations, and envision future research directions.
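The abstract does not spell out how the CHI measure is computed. As a minimal sketch, assuming it refers to the standard chi-square term-class association statistic used in text classification, the score for one term and one trend class could be computed from a 2x2 document-count table as below. The function name `chi_square` and the example counts are hypothetical and are not taken from the article.

```python
def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Standard chi-square association score for a term and a class.

    Assumed 2x2 contingency counts (illustrative, not from the article):
      a: documents in the class that contain the term
      b: documents outside the class that contain the term
      c: documents in the class that do not contain the term
      d: documents outside the class that do not contain the term
    """
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        # A row or column of the table is empty, so no association
        # can be measured; return 0.0 rather than dividing by zero.
        return 0.0
    return n * (a * d - c * b) ** 2 / denominator


# Hypothetical counts: the term appears mostly in posts of one trend class.
strong = chi_square(a=40, b=10, c=10, d=40)
# Hypothetical counts: the term is spread evenly across classes.
none = chi_square(a=25, b=25, c=25, d=25)
print(strong, none)
```

A higher score indicates that the term's presence depends more strongly on the class, which is why such a statistic is commonly used to rank candidate features before training a classifier.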


Social network, Blog, Collective behavior, Sentiment analysis, Ant colony, Swarm intelligence, Supervised learning, Trend prediction





Copyright information

© Springer-Verlag London Limited 2012

Authors and Affiliations

  1. Department of Computer Science, Birla Institute of Technology, Mesra, India
  2. Information Science Department, University of Arkansas at Little Rock, Little Rock, USA
