Social Network Analysis and Mining, Volume 2, Issue 2, pp 139–162

Modeling blogger influence in a community

  • Nitin Agarwal
  • Huan Liu
  • Lei Tang
  • Philip S. Yu
Original Article


Blogging has become a popular and convenient way to communicate, publish information, share preferences, voice opinions, provide suggestions, report news, and form virtual communities in the blogosphere. The blogosphere follows a power-law distribution: very few blogs are extremely influential, while a huge number are largely unknown. Regardless of whether a (multi-author) blog is influential, there are influential bloggers. However, the sheer number of such blogs makes it extremely challenging to study each one of them. One way to analyze these blogs is to find influential bloggers and treat them as community representatives. Influential bloggers can impact fellow bloggers in various ways. In this paper, we study the problem of identifying influential bloggers. We define influential bloggers, investigate their characteristics, discuss the challenges of identifying them, develop a model to quantify their influence, and pave the way for further research leading to more sophisticated models that can categorize the various types of influential bloggers. To highlight these issues, we conduct experiments using data from blogs, evaluate multiple facets of the problem, and present a unique and objective evaluation strategy, given the subjectivity in defining influence, in addition to various other analytical capabilities. We conclude with interesting findings and future work.


Social network, Blogosphere, Influence, Influential bloggers, Evaluation



This research was funded in part by the National Science Foundation's Social-Computational Systems (SoCS) Program within the Directorate for Computer and Information Science and Engineering's Division of Information and Intelligent Systems (Award numbers: IIS-1110868 and IIS-1110649), the US Office of Naval Research (Grant number: N000141010091), and the US Air Force Office of Scientific Research (Grant number: FA95500810132). We gratefully acknowledge this support.



Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  • Nitin Agarwal (1)
  • Huan Liu (2)
  • Lei Tang (3)
  • Philip S. Yu (4)

  1. University of Arkansas at Little Rock, Little Rock, USA
  2. Arizona State University, Tempe, USA
  3. Yahoo! Labs, Santa Clara, USA
  4. University of Illinois at Chicago, Chicago, USA
