Information Systems Frontiers, Volume 20, Issue 6, pp 1191–1202

Scaling Up Integrated Structural and Content-Based Network Analysis

  • Jennifer Golbeck
  • Jeff Gerhard
  • Farrah O’Colman
  • Ryan O’Colman


Structural analysis of social networks can provide important insights into a network's clusters and its most important nodes. However, it is silent on the content-based reasons for that importance or commonality. Reaching this additional level of insight requires sampling content from nodes and processing it. Human analysts do this effectively, but as networks grow to Big Data scale, manual analysis becomes infeasible. This raises the question of whether automated techniques can reproduce the results that humans find. In this paper, we demonstrate how topic modeling can be applied, filtered, and adapted to produce easy-to-understand keywords that represent important clusters in a network. Those keywords reflect the insights achieved by human analysts performing a manual content-based analysis of the network features. While humans should never be removed from the analysis process, this work shows how automated techniques can be integrated to scale humans’ ability to gain insights into large networks.
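As a rough illustration of the idea of turning cluster content into representative keywords, the sketch below ranks terms per cluster by how frequent they are within the cluster and how rare they are across clusters. This is not the authors' method: it swaps in a simple TF-IDF-style score as a stand-in for a full topic model, and the cluster labels, sample texts, stopword list, and function names are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy data: cluster label -> text sampled from nodes in that cluster.
cluster_texts = {
    "cluster_a": [
        "election vote candidate campaign vote today",
        "candidate debate election policy",
    ],
    "cluster_b": [
        "football match goal team goal today",
        "team coach match season",
    ],
}

# Hypothetical, deliberately tiny stopword list used as a filter.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cluster_keywords(texts_by_cluster, top_n=3):
    """Return the top_n terms per cluster, scored TF-IDF style."""
    # Term counts within each cluster.
    counts = {
        cluster: Counter(w for text in texts for w in tokenize(text))
        for cluster, texts in texts_by_cluster.items()
    }
    n_clusters = len(counts)
    # Number of clusters in which each term appears at least once.
    cluster_freq = Counter(w for term_counts in counts.values() for w in term_counts)

    keywords = {}
    for cluster, term_counts in counts.items():
        scores = {
            w: tf * math.log(1 + n_clusters / cluster_freq[w])
            for w, tf in term_counts.items()
        }
        # Sort by descending score, breaking ties alphabetically for stable output.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        keywords[cluster] = ranked[:top_n]
    return keywords


if __name__ == "__main__":
    for cluster, words in cluster_keywords(cluster_texts).items():
        print(cluster, words)
```

Terms shared by every cluster (here, "today") score lower than terms concentrated in one cluster, which is the filtering effect that makes the resulting keywords easier to read. A topic model would replace the scoring step while leaving the per-cluster grouping unchanged.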


Social network analysis, Topic modeling, Big data



This work was conducted with the support of the National Science Foundation award 1546829.



Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  • Jennifer Golbeck (1), Email author
  • Jeff Gerhard (1)
  • Farrah O’Colman (1)
  • Ryan O’Colman (1)

  1. University of Maryland, College Park, USA
