Analyzing Big Data
This chapter examines how big data and data science methods can support law and policy research with empirical evidence on digital media production and consumption. To this end we analyze two cases. The simple case concerns the automatic scraping of news media websites to gather data on what news organizations publish. The complex case is Robin, a research infrastructure that allows volunteers to donate their web browsing data stream so that the process of personalized communication online can be studied. We discuss the issues researchers need to consider during the planning, data collection, and analysis phases of big-data-based research. We conclude that, despite the limitations, difficulties, and well-justified critique, social scientists, legal scholars, and researchers working in the humanities need to develop individual skills and institutional competencies in big data methods, because data science is quickly becoming an indispensable part of the methodological toolset of these disciplines.