Analyzing collective behavior from blogs using swarm intelligence

  • Regular Paper
Knowledge and Information Systems

Abstract

With the rapid growth in the availability and popularity of interpersonal, behavior-rich resources such as blogs and other social media avenues, new opportunities and challenges arise as people can, and do, actively use computational intelligence to seek out and understand the opinions of others. The study of the collective behavior of individuals has implications for business intelligence, predictive analytics, customer relationship management, and the examination of online collective action as manifested by various flash mobs, the Arab Spring (2011), and other such events. In this article, we introduce a nature-inspired theory to model collective behavior from observed blog data using swarm intelligence, where the goal is to accurately model and predict the future behavior of a large population after observing its interactions during a training phase. Specifically, an ant colony optimization model is trained with behavioral trends from the blog data and is tested on real-world blogs. Promising results were obtained in trend prediction using an ant colony based pheromone classifier and the CHI statistical measure. We provide empirical guidelines for selecting suitable parameters for the model, conclude with interesting observations, and envision future research directions.

Author information

Corresponding author

Correspondence to Soumya Banerjee.

About this article

Cite this article

Banerjee, S., Agarwal, N. Analyzing collective behavior from blogs using swarm intelligence. Knowl Inf Syst 33, 523–547 (2012). https://doi.org/10.1007/s10115-012-0512-y

  • DOI: https://doi.org/10.1007/s10115-012-0512-y
