Social Network Analysis and Mining, Volume 3, Issue 4, pp 1165–1177

Network text analysis of conceptual overlap in interviews, newspaper articles and keywords

  • Michael K. Martin
  • Juergen Pfeffer
  • Kathleen M. Carley
Original Article


We address the relative value of three information sources: costly interviews conducted in the field, newspaper articles that mention the areas in which the interviews took place, and keywords used to index the newspaper articles. Our research questions concern: (1) whether there is overlap in the information obtained from each source and (2) how the three information acquisition-extraction strategies employed can inform one another. This research project uses network text analysis as a framework for a mixed method approach to knowledge discovery. We show that concepts as well as the network structure of information obtained from interviews may be almost completely covered by networks representing the information extracted from a large number of news articles from a wide variety of sources, while the information overlap of interviews with article keywords was less straightforward. We also show how a conceptual network constructed from a small number of interviews can be used in a semantic pattern search that localizes interview topics in a larger network of news article topics. This approach thus uses newspaper articles to frame and elaborate the narratives of interviews in a larger cultural context.
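The notion of overlap between a small interview network and a larger news network can be illustrated with a minimal sketch. This is not the authors' tooling or procedure: the function names `concept_network` and `coverage`, the whitespace tokenizer, the co-occurrence window of three tokens, and the toy texts are all assumptions made for illustration. The sketch builds one concept network per source and reports what fraction of the interview concepts and links also appear in the news network.

```python
from itertools import combinations


def concept_network(texts, window=3):
    """Build a toy concept network from a list of texts.

    Returns (concepts, edges): `concepts` is the set of lowercase tokens,
    `edges` is the set of unordered token pairs that co-occur within a
    sliding window of `window` tokens. Both choices are illustrative.
    """
    concepts, edges = set(), set()
    for text in texts:
        tokens = text.lower().split()
        concepts.update(tokens)
        for i in range(len(tokens)):
            # Link every pair of distinct tokens inside the current window.
            for a, b in combinations(tokens[i:i + window], 2):
                if a != b:
                    edges.add(frozenset((a, b)))
    return concepts, edges


def coverage(small, large):
    """Fraction of the items in `small` that also occur in `large`."""
    return len(small & large) / len(small) if small else 0.0


# Hypothetical toy data standing in for the two sources.
interviews = ["water shortage village", "village elders dispute"]
news = [
    "water shortage hits village",
    "village elders settle land dispute",
    "election results announced",
]

i_nodes, i_edges = concept_network(interviews)
n_nodes, n_edges = concept_network(news)

node_cov = coverage(i_nodes, n_nodes)  # share of interview concepts found in news
edge_cov = coverage(i_edges, n_edges)  # share of interview links found in news
print(f"concept coverage: {node_cov:.2f}, link coverage: {edge_cov:.2f}")
```

Reporting concept coverage and link coverage separately keeps the two questions apart: whether the same concepts appear in both sources, and whether they are connected in the same way. The numbers produced by the toy data say nothing about the study's actual results.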


  • Text mining
  • Network text analysis
  • Meta-network analysis
  • News articles
  • Interviews
  • Keywords



This work is supported in part by the Office of Naval Research (ONR), United States Navy (ONR MURI N000140811186, ONR MMT N00014060104). The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Office of Naval Research or the U.S. government.



Copyright information

© Springer-Verlag Wien 2013

Authors and Affiliations

  • Michael K. Martin (1)
  • Juergen Pfeffer (1)
  • Kathleen M. Carley (1)
  1. Carnegie Mellon University, Pittsburgh, USA
