Blended Data: Critiquing and Complementing Social Media Datasets, Big and Small

Living reference work entry


Internet research, and social media research in particular, has benefited from concurrent technological and analytical developments that have enabled access to vast amounts of user data and content online. These trends have been accompanied by a prevalence of Big Data studies of online activity, with researchers gathering datasets featuring millions of tweets, for instance. Here, Big Data refers not solely to the size of datasets but to the wider practices and research cultures around large-scale, exhaustive, and often ongoing capture of data from large groups, often (but not always) studied quantitatively (see Kitchin and Lauriault 2014a; Crawford et al. 2014). However, the accessibility of “big social data” (Manovich 2012) for Internet studies research is not without its limitations and challenges. While extensive datasets enable valuable research, combining them with small data can provide more rounded perspectives and encourage us to think more carefully about what we are studying. Similarly, privileging online-only or quantitative analysis of social media activity may overlook or mask key practices and relevant participants not present within the datasets. We argue for a blended data model as a critique of and complement to different social media datasets, drawing in part on our research into social movements and activists’ use (and non-use) of online technologies. Together, these approaches may help negotiate and overcome the respective limits and challenges of social media data, both big and small.


Keywords: Social media; Big Data; Ethics; Methods; Social movements


  1. Ananny M, Crawford K (2016) Seeing without knowing: limitations of the transparency ideal and its application to algorithmic accountability. New Media Soc.
  2. Baym NK (2013) Data not seen: the uses and shortcomings of social media metrics. First Monday 18(10).
  3. boyd d, Crawford K (2012) Critical questions for Big Data. Info Comm Soc 15(5):662–679.
  4. Brock A (2012) From the blackhand side: Twitter as a cultural conversation. J Broadcast Electron Media 56(4):529–549.
  5. Brock A (2015) Deeper data: a response to Boyd and Crawford. Media Cult Soc 37(7):1084–1088.
  6. Brown LA, Strega S (2005) Research as resistance: critical, indigenous and anti-oppressive approaches. Canadian Scholars’ Press, Toronto.
  7. Bucher T (2012) Want to be on the top? Algorithmic power and the threat of invisibility on Facebook. New Media Soc 14(7):1164–1180.
  8. Burgess J, Galloway A, Sauter T (2015) Hashtag as hybrid forum: the case of #agchatoz. In: Rambukkana N (ed) Hashtag publics. Peter Lang, New York, NY, pp 61–76.
  9. Burgess J, Matamoros Fernández A (2016) Mapping sociocultural controversies across digital media platforms: one week of #gamergate on Twitter, YouTube, and Tumblr. Comm Res Pract 2(1):79–96.
  10. Chan A (2015) Big data interfaces and the problem of inclusion. Media Cult Soc 0163443715594106.
  11. Chesters G (2012) Social movements and the ethics of knowledge production. Soc Mov Stud 11(2):145–160.
  12. Chief Elk L (2014) Teach-In Summer Fundraiser. YouCaring. Retrieved 3 June 2015.
  13. Cho S, Crenshaw KW, McCall L (2013) Toward a field of intersectionality studies: theory, applications, and praxis. Signs 38(4):785–810.
  14. Collins PH (1989) The social construction of black feminist thought. Signs 14(4):745–773.
  15. Consalvo M (2012) Confronting toxic gamer culture: a challenge for feminist game studies scholars. J Gender New Media Technol 1.
  16. Crawford K, Miltner K, Gray ML (2014) Critiquing Big Data: politics, ethics, epistemology. Int J Comm 8:1663–1672.
  17. Croeser S (2015) Global justice and the politics of information: the struggle over knowledge. Routledge, Hoboken, NJ.
  18. Croeser S, Highfield T (2014) Occupy Oakland and #oo: uses of Twitter within the occupy movement. First Monday 19(3).
  19. Croeser S, Highfield T (2015a) Harbouring dissent: Greek independent and social media and the antifascist movement. Fibreculture 26:136–157.
  20. Croeser S, Highfield T (2015b) Mapping movements - social movement research and Big Data: critiques and alternatives. In: Langlois G, Redden J, Elmer G (eds) Compromised data: from social media to Big Data. Bloomsbury, pp 173–201.
  21. Driscoll K, Thorson K (2015) Searching and clustering methodologies: connecting political communication content across platforms. Ann Am Acad Pol Soc Sci 659(1):134–148.
  22. Duguay S (forthcoming) Social media’s breaking news: the logic of automation in Facebook trending topics and Twitter moments.
  23. Freelon D, McIlwain CD, Clark MD (2016) Beyond the hashtags: #Ferguson, #Blacklivesmatter, and the online struggle for offline justice. Centre for Social Media & Social Impact, Washington, DC.
  24. Gillespie T (2010) The politics of “platforms”. New Media Soc 12(3):347–364.
  25. Gillespie T (2014) The relevance of algorithms. In: Gillespie T, Boczkowski PJ, Foot KA (eds) Media technologies: essays on communication, materiality, and society. The MIT Press, Cambridge, MA, pp 167–194.
  26. González-Bailón S, Wang N, Rivero A, Borge-Holthoefer J, Moreno Y (2014) Assessing the bias in samples of large online networks. Soc Networks 38:16–27.
  27. Harry S (2014, October 6) Everyone Watches, Nobody Sees: How Black Women Disrupt Surveillance Theory. Model View Culture.
  28. Hesse-Biber S, Johnson RB (2013) Coming at things differently: future directions of possible engagement with mixed methods research. J Mixed Methods Res 7(2):103–109.
  29. Highfield T, Leaver T (2015) A methodology for mapping Instagram hashtags. First Monday 20(1).
  30. Highfield T, Leaver T (2016) Instagrammatics and digital methods: studying visual social media, from selfies and GIFs to memes and emoji. Comm Res Pract 2(1):47–62.
  31. Hoffman AL (2014, June 30) Reckoning with a decade of breaking things. Model View Culture.
  32. hooks b (2000) Feminist theory: from margin to center. Pluto Press, London.
  33. Humphreys L, Gill P, Krishnamurthy B, Newbury E (2013) Historicizing new media: a content analysis of Twitter. J Commun 63(3):413–431.
  34. Johnson RB, Onwuegbuzie AJ, Turner LA (2007) Toward a definition of mixed methods research. J Mixed Methods Res 1(2):112–133.
  35. Kim D (2014, October 7) Social media and academic surveillance: the ethics of digital bodies. Model View Culture.
  36. Kitchin R (2014) Big Data, new epistemologies and paradigm shifts. Big Data Soc 1(1):2053951714528481.
  37. Kitchin R, Lauriault TP (2014a) Small Data, Data Infrastructures and Big Data (SSRN Scholarly Paper No. ID 2376148). Social Science Research Network, Rochester, NY.
  38. Kitchin R, Lauriault TP (2014b) Towards critical data studies: charting and unpacking data assemblages and their work (SSRN Scholarly Paper No. ID 2474112). Social Science Research Network, Rochester, NY.
  39. Kim D, Kim E (2014, April 7) The #TwitterEthics manifesto. Model View Culture.
  40. Krikorian R (2014) Introducing Twitter Data Grants. Retrieved 4 Aug 2015.
  41. Langlois G, Elmer G (2013) The research politics of social media platforms. Culture Machine 14.
  42. Mahrt M, Scharkow M (2013) The value of Big Data in digital media research. J Broadcast Electron Media 57(1):20–33.
  43. Manovich L (2012) Trending: the promises and the challenges of Big Social Data. In: Gold MK (ed) Debates in the digital humanities. University of Minnesota Press, Minneapolis, pp 460–475.
  44. Marwick A, Caplan R (2017) Media manipulation and disinformation online. Data Soc, New York City.
  45. Massanari A (2017) #Gamergate and the Fappening: how Reddit’s algorithm, governance, and culture support toxic technocultures. New Media Soc 19(3):329–346.
  46. Matamoros Fernández A (2017) Platformed racism: the mediation and circulation of an Australian race-based controversy on Twitter, Facebook and YouTube. Info Comm Soc 20(6):930–946.
  47. McElroy K (2015) Gold medals, Black Twitter, and not-so-good hair: framing the Gabby Douglas controversy. ISOJ 1(1).
  48. Moe H (2010) Everyone a pamphleteer? Reconsidering comparisons of mediated public participation in the print age and the digital era. Media Cult Soc 32(4):691–700.
  49. Morstatter F, Pfeffer J, Liu H, Carley KM (2013) Is the sample good enough? Comparing data from Twitter’s Streaming API with Twitter’s Firehose. In: Proceedings of the 7th International AAAI Conference on Weblogs and Social Media, pp 400–408.
  50. Noble SU (forthcoming) Algorithms of oppression: how search engines reinforce racism. NYU Press, New York City.
  51. Papacharissi Z (2015) The unbearable lightness of information and the impossible gravitas of knowledge: Big Data and the makings of a digital orality. Media Cult Soc 0163443715594103.
  52. Puschmann C, Burgess J (2014) Metaphors of Big Data. Int J Comm 8:1690–1709.
  53. Qiu JL (2015) Reflections on Big Data: “just because it is accessible does not make it ethical”. Media Cult Soc 0163443715594104.
  54. Ramsey DX (2015, April 10) The truth about Black Twitter. The Atlantic.
  55. Raynes-Goldie K (2012) Privacy in the age of Facebook: discourse, architecture, consequences. Curtin University.
  56. Reger J (2001) Emotions, objectivity and voice: an analysis of a “failed” participant observation. Women’s Stud Int Forum 24(5):605–616.
  57. Rentschler CA (2014) Rape culture and the feminist politics of social media. Girlhood Studies 7(1):65–82.
  58. Sawyer S (2008) Data wealth, data poverty, science and cyberinfrastructure. Prometheus 26(4):355–371.
  59. Steinhauer J (2014, July 28) Native activist charges art students with plagiarism. Hyperallergic. Retrieved 3 June 2015.
  60. Tufekci Z (2014) Big Questions for social media Big Data: Representativeness, validity and other methodological pitfalls. In: ICWSM ’14: Proceedings of the 8th International AAAI Conference on Weblogs and Social Media. Ann Arbor.
  61. Tufekci Z (2017) Twitter and tear gas: the power and fragility of networked protest. Yale University Press, New Haven, US.
  62. van Dijck J, Poell T (2013) Understanding social media logic. Media Comm 1(1):2–14.
  63. Vis F (2013) A critical reflection on Big Data: considering APIs, researchers and tools as data makers. First Monday 18(10).
  64. Weller K, Kinder-Kurlanda KE (2015) Uncovering the challenges in collection, sharing and documentation: The hidden data of social media research? Ninth International AAAI Conference on Web and Social Media.
  65. Young S (2012) I identify as a disabled person. Retrieved 27 Aug 2015.
  66. Zimmer M, Proferes N (2014) A topology of Twitter research: disciplines, methods, and ethics. Aslib J Manag 66(3):250–261.

Authors and Affiliations

  1. Curtin University, Perth, Australia
  2. University of Amsterdam, Amsterdam, Netherlands
