The American Sociologist, Volume 47, Issue 1, pp 12–35

Sociology in the Era of Big Data: The Ascent of Forensic Social Science

  • Daniel A. McFarland
  • Kevin Lewis
  • Amir Goldberg


The rise of big data (data that are not only large and massively multivariate but also concern a dizzying array of phenomena) represents a watershed moment for the social sciences. These data have created demand for new methods that reduce or simplify the dimensionality of data, identify novel patterns and relations, and predict outcomes, from computational ethnography and computational linguistics to network science, machine learning, and in situ experiments. Such developments have led scholars to begin new lines of social inquiry. Company engineers, computer scientists, and social scientists have all converged on big data, creating the possibility of a vibrant “trading zone” for collaboration. However, strong differences in research frameworks help explain why big data may not be an egalitarian trading zone across fields but rather, at least in the short term, a moment when engineering colonizes sociology more than vice versa. In the long term, however, we suggest there may be the possibility of a constructive synthesis across paradigms in what we term ‘forensic social science.’


Keywords: Big data; Computational social science; Sociology of science; Forensic social science



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Daniel A. McFarland (1)
  • Kevin Lewis (2)
  • Amir Goldberg (1)

  1. Stanford University, Stanford, USA
  2. University of California, San Diego, USA
