Bias in algorithmic filtering and personalization

Abstract

Online information intermediaries such as Facebook and Google are slowly replacing traditional media channels, thereby partly becoming the gatekeepers of our society. To deal with the growing amount of information on the social web and the burden it places on the average user, these gatekeepers have recently started to introduce personalization features: algorithms that filter information per individual. In this paper we show that these online services that filter information are not merely algorithms. Humans not only affect the design of the algorithms, but can also manually influence the filtering process even when the algorithm is operational. We further analyze filtering processes in detail, show how personalization connects to other filtering techniques, and show that both human and technical biases are present in today’s emergent gatekeepers. We use the existing literature on gatekeeping and search engine bias and provide a model of algorithmic gatekeeping.

Notes

  1. For instance, Facebook uses an algorithm called EdgeRank to determine how a user’s news feed is constructed. Several factors are believed to be used to select and prioritize user updates, such as the affinity between the receiver and the sender and the date of the published update. However, the exact formula is unknown. See Techcrunch (2011).

  2. We would like to thank the anonymous reviewers for pointing out this fact.

References

  1. Accuracast. (2010). Facebook advertising policies homophobic. May. http://searchdailynews.blogspot.com/2010/05/facebook-advertising-policies.html.

  2. Adomavicius, G., Sankaranarayanan, R., Sen, S., & Tuzhilin, A. (2005). Incorporating contextual information in recommender systems using a multidimensional approach. ACM Transactions on Information Systems (TOIS), 23(1), 103–145.

  3. Agada, J. (1999). Inner-city gatekeepers: An exploratory survey of their information use environment. Journal of the American Society for Information Science, 50(1), 74–85. http://www.eric.ed.gov/ERICWebPortal/detail?accno=EJ582286.

  4. Albanesius, C. (2011). Schmidt, Yelp clash over Google’s search tactics. PCMAG. http://www.pcmag.com/article2/0,2817,2393369,00.asp.

  5. Althaus, S. L., & Tewksbury, D. (2002). Agenda setting and the ‘new’ news. Communication Research, 29(2), 180.

  6. Badash, D. (2011). Has Facebook censorship gone too far? The New Civil Rights Movement. http://thenewcivilrightsmovement.com/has-facebook-censorship-gone-too-far/politics/2011/11/07/29714.

  7. Bagdikian, B. H. (2004). The new media monopoly: A completely revised and updated edition with seven new chapters. Beacon Press, May. http://www.amazon.com/dp/0807061875.

  8. Bakshy, E., Rosenn, I., Marlow, C., & Adamic, L. (2012). The role of social networks in information diffusion. In Proceedings of the 21st international conference on World Wide Web (WWW '12) (pp. 519–528). New York, NY, USA: ACM. doi:10.1145/2187836.2187907. http://doi.acm.org/10.1145/2187836.2187907.

  9. Bar-Ilan, J., Keenoy, K., Levene, M., & Yaari, E. (2009). Presentation bias is significant in determining user preference for search results—A user study. Journal of the American Society for Information Science and Technology, 60(1), 135–149.

  10. Barzilai-Nahon, K. (2008). Toward a theory of network gatekeeping: A framework for exploring information control. Journal of the American Society for Information Science and Technology, 59(9), 1493–1512.

  11. Barzilai-Nahon, K. (2009). Gatekeeping: A critical review. Annual Review of Information Science and Technology, 43(1), 1–79.

  12. Benkler, Y. (2006). The wealth of networks: How social production transforms markets and freedom. New Haven: Yale University Press.

  13. Bruns, A. (2008). Gatewatching, gatecrashing: Futures for tactical news media. In M. Boler (Ed.), Digital media and democracy: Tactics in hard times (pp. 247–271). MIT Press. http://www.amazon.com/dp/0262026422.

  14. Bruns, A. (2011). Gatekeeping, gatewatching, real-time feedback. Brazilian Journalism Research, 7, 117–136.

  15. Chatman, E. A. (1987). Opinion leadership, poverty, and information sharing. RQ, 26(3), 341–353. http://www.eric.ed.gov/ERICWebPortal/detail?accno=EJ354348.

  16. Chen, C. C., & Hernon, P. (1982). Information seeking: Assessing and anticipating user needs. Neal-Schuman Publishers. http://books.google.nl/books?id=_6fgAAAAMAAJ.

  17. Chibnall, S. (1975). The crime reporter: A study in the production of commercial knowledge. Sociology, 9(1), 49–66.

  18. Chilling Effects. (2005). Scientology complains that advanced technology appears in Google Groups. http://www.chillingeffects.org/dmca512/notice.cgi?NoticeID=2355.

  19. Christman, J. (2011). Autonomy in moral and political philosophy. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy. Stanford, CA: CSLI, Stanford University.

  20. Cisco. (2011). Cisco visual networking index: Forecast and methodology, whitepaper. http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/white_paper_c11-481360_ns827_Networking_Solutions_White_Paper.html.

  21. Costolo, D. (2011). The trends are algorithmic, not chosen by us but we edit out any w/obscenities. July. https://twitter.com/#!/dickc/status/97686216681594880.

  22. Cuilenburg, V. (1999). On competition, access and diversity in media, old and new: Some remarks for communications policy in the information age. New Media & Society, 1(2), 183–207.

  23. Dekker, V. (2006). Google: Een zwijgzame rechter en politieagent. Trouw.

  24. Diaz, A. (2008). Through the Google goggles: Sociopolitical bias in search engine design. In A. Spink & M. Zimmer (Eds.), Information science and knowledge management (Vol. 14, pp. 11–34). Berlin Heidelberg: Springer.

  25. Durrance, J. C. (1984). Armed for action: Library response to citizen information needs. New York, NY: Neal Schuman.

  26. Edelman, B. (2011). Bias in search results: Diagnosis and response. Indian Journal of Law and Technology, 7, 16.

  27. Efrati, A. (2010). Rivals say Google plays favourites. Wall Street Journal, December. http://online.wsj.com/article/SB10001424052748704058704576015630188568972.html.

  28. Elgan, M. (2011). How to pop your Internet ‘filter bubble’. Computerworld. http://www.computerworld.com/s/article/9216484/Elgan_How_to_pop_your_Internet_filter_bubble_.

  29. Eppler, M. J., & Mengis, J. (2004). The concept of information overload: A review of literature from organization science, accounting, marketing, MIS, and related disciplines. The Information Society, 20(5), 325–344.

  30. Fallows, D. (2005). Search engine users. http://www.pewinternet.org/Reports/2005/Search-Engine-Users/8-Conclusions/Conclusions.aspx.

  31. Fishman, M. (1988). Manufacturing the news. Austin: University of Texas Press.

  32. Flanagan, M., Howe, D., & Nissenbaum, H. (2008). Embodying values in technology: Theory and practice. In J. van den Hoven & J. Weckert (Eds.), Information technology and moral philosophy (pp. 322–353). Cambridge: Cambridge University Press.

  33. Fong, J. (2011). Facebook’s bias against 3rd party apps. http://www.jenfongspeaks.com/facebooks-bias-against-3rd-party-apps/.

  34. Foundem. (2009). Foundem’s Google story.

  35. Friedman, B., Kahn, P. H., & Borning, A. (2006). Value sensitive design and information systems. Human-Computer Interaction in Management Information Systems: Foundations, 4, 348–372.

  36. Friedman, B., & Nissenbaum, H. (1996). Bias in computer systems. ACM Transactions on Information Systems, 14(3), 330–347.

  37. Friedman, B., & Nissenbaum, H. (1997). Software agents and user autonomy. In Proceedings of the first international conference on autonomous agents (AGENTS ’97) (pp. 466–469).

  38. Gans, H. J. (2005). Deciding what’s news: A study of CBS Evening News, NBC Nightly News, Newsweek, and Time (2nd ed.). Evanston: Northwestern University Press.

  39. Garcia-Molina, H., Koutrika, G., & Parameswaran, A. (2011). Information seeking. Communications of the ACM, 54(11), 121. doi:10.1145/2018396.2018423.

  40. Gauch, S., Speretta, M., Chandramouli, A., & Micarelli, A. (2007). User profiles for personalized information access. The adaptive web (pp. 54–89). Berlin Heidelberg: Springer.

  41. Gillespie, T. (2012). Can an algorithm be wrong? Limn (2). http://limn.it/can-an-algorithm-be-wrong/.

  42. Goldman, E. (2005). Search engine bias and the demise of search engine utopianism. Yale Journal of Law & Technology, 8, 188.

  43. Goldman, E. (2011). Revisiting search engine bias. In Contemporary issues in cyberlaw. William Mitchell Law Review, 38, 96–110.

  44. Google. (2008). We knew the web was big. http://googleblog.blogspot.com/2008/07/we-knew-web-was-big.html.

  45. Google. (2011). Transparency report. http://www.google.com/transparencyreport/governmentrequests/.

  46. Google. (2012). Search plus your world: Personal results. http://support.google.com/websearch/bin/answer.py?hl=en&answer=1710607.

  47. Granka, L. A. (2010). The politics of search: A decade retrospective. The Information Society, 26(5), 364–374. doi:10.1080/01972243.2010.511560.

  48. Granovetter, M. S. (1981). The strength of weak ties: a network theory revisited. State University of New York, Department of Sociology.

  49. Groot, J. (2004). Trouw wekenlang niet te vinden op Google. Webwereld.

  50. Guha, S., Cheng, B., & Francis, P. (2010). Challenges in measuring online advertising systems. In Proceedings of the 10th ACM SIGCOMM conference on Internet measurement (IMC '10) (pp. 81–87). New York, NY, USA: ACM. doi:10.1145/1879141.1879152. http://doi.acm.org/10.1145/1879141.1879152.

  51. Helberger, N. (2011). Diversity by design. Journal of Information Policy, 1, 441–469.

  52. Hermida, A. (2012). Tweets and truth: Journalism as a discipline of collaborative verification. Journalism Practice, 6(5-6), 659–668.

  53. Hilbert, M. (2012). Toward a synthesis of cognitive biases: How noisy information processing can bias human decision making. Psychological Bulletin, 138(2), 211–237.

  54. Hitwise. (2010). Social networks now more popular than search engines in the UK.

  55. Hoven, J. V., & Rooksby, E. (2008). Distributive justice and the value of information: A (broadly) Rawlsian approach. England: Cambridge University Press.

  56. IBM. (2011). Bringing smarter computing to big data.

  57. Ingram, M. (2011). The downside of Facebook as a public space: Censorship. June. http://gigaom.com/2011/06/21/the-downside-of-facebook-as-a-public-space-censorship/.

  58. Jacobs, G. (2010). How to optimise your site for Google Caffeine. Techradar.com, April. http://www.techradar.com/news/internet/how-to-optimise-your-site-for-google-caffeine-685436.

  59. Joachims, T., & Radlinski, F. (2007). Search engines that learn from implicit feedback. Computer, 40(8), 34–40.

  60. Katz, E. (1996). And deliver us from segmentation. Annals of the American Academy of Political and Social Science, 546, 22–33.

  61. Katz, E., & Lazarsfeld, P. (2005). Personal influence: The part played by people in the flow of mass communications. New Jersey: Transaction Publishers.

  62. Kincaid, J. (2010). Today’s lesson: Make Facebook angry, and they’ll censor you into oblivion. TechCrunch. http://techcrunch.com/2010/11/22/facebook-censorship/.

  63. Klein, J. (2011). A web marketer’s guide to Reddit. December. http://www.distilled.net/blog/social-media/a-web-marketers-guide-to-reddit/.

  64. Knight, W. (2012). Google hopes to make friends with a more social search. Technology Review. http://www.technologyreview.com/computing/39444/.

  65. Korolova, A. (2010). Privacy violations using microtargeted ads: A case study. In Proceedings of the IEEE international conference on data mining workshops (ICDMW '10) (pp. 474–482). Washington, DC, USA: IEEE Computer Society. doi:10.1109/ICDMW.2010.137. http://dx.doi.org/10.1109/ICDMW.2010.137.

  66. Lasorsa, D. L., Lewis, S. C., & Holton, A. (2012). Normalizing Twitter-Journalism practice in an emerging communication space. Journalism Studies, 13(1), 19–36.

  67. Lavie, T., Sela, M., Oppenheim, I., Inbar, O., & Meyer, J. (2009). User attitudes towards news content personalization. International Journal of Human-Computer Studies, 68(8), 483–495.

  68. Levinson, P. (1999). Digital McLuhan: A guide to the information millennium (1st ed.). London: Routledge.

  69. Lotan, G. (2011). Data reveals that “occupying” Twitter trending topics is harder than it looks! http://blog.socialflow.com/post/7120244374/data-reveals-that-occupying-twitter-trending-topics-is-harder-than-it-looks.

  70. Lu, Y. (2007). The human in human information acquisition: Understanding gatekeeping and proposing new directions in scholarship. Library & Information Science Research, 29(1), 103–123.

  71. Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., & Roxburgh, C. (2011). Big data: The next frontier for innovation, competition, and productivity. McKinsey Global Institute report. http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation.

  72. Metz, C. (2011a). Google opens curtain on ‘manual’ search penalties. The register. http://www.theregister.co.uk/2011/02/18/google_on_manual_search_penalties/.

  73. Metz, C. (2011b). Google contradicts own counsel in face of antitrust probe, admits existence of search algorithm whitelists. http://www.theregister.co.uk/2011/03/11/google_admits_search_algorithm_whitelists/.

  74. Morozov, E. (2011). Your own facts. Book review of ‘the filter bubble, what the internet is hiding from you’. The New York Times.

  75. Mowshowitz, A., & Kawaguchi, A. (2002). Bias on the web. Communications of the ACM, 45(9), 56–60.

  76. Munson, S. Z., & Resnick, P. (2010). Presenting diverse political opinions: How and how much. In Proceedings of the SIGCHI conference on human factors in computing systems (CHI ’10).

  77. Nagesh, G. (2011). Privacy advocates want Facebook probed on recent changes.

  78. Napoli, P. (1999). Deconstructing the diversity principle. Journal of Communication, 49(4), 7–34.

  79. Nissenbaum, H., & Introna, L. D. (2000). Shaping the web: Why the politics of search engines matters. The Information Society, 16(3), 169–185.

  80. O’Dell, J. (2011). Facebook’s ad revenue hit $1.86B for 2010. Mashable.

  81. Opsahl, K. (2009). Google begins behavioural targeting ad program. https://www.eff.org/deeplinks/2009/03/google-begins-behavioral-targeting-ad-program.

  82. Pariser, E. (2011a). 10 ways to pop your filter bubble. http://www.thefilterbubble.com/10-things-you-can-do.

  83. Pariser, E. (2011b). The filter bubble: What the internet is hiding from you. London: Penguin Press.

  84. Priestley, M. (1999). Honest news in the slashdot decade. First Monday, 4, 2–8.

  85. Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., & Riedl, J. (1994). GroupLens: An open architecture for collaborative filtering of netnews. In Proceedings of the 1994 ACM conference on computer supported cooperative work (CSCW '94) (pp. 175–186). New York, NY, USA: ACM. doi:10.1145/192844.192905. http://doi.acm.org/10.1145/192844.192905.

  86. Salihefendic, A. (2010). How Reddit ranking algorithms work. http://amix.dk/blog/post/19588.

  87. Schroeder, S. (2011). Twitter ad revenue may reach $150 million this year. Mashable.

  88. Searchenginewatch. (2012). Twitter: Google search plus your world bad for web users. Search Engine Watch. http://searchenginewatch.com/article/2136873/Twitter-Google-Search-Plus-Your-World-Bad-for-Web-Users.

  89. Segal, D. (2011). Search optimization and its dirty little secrets. The New York Times.

  90. Shardanand, U., & Maes, P. (1995). Social information filtering: algorithms for automating “word of mouth”. In I. R. Katz, R. Mack, L. Marks, M. B Rosson & J. Nielsen (Eds.), Proceedings of the SIGCHI conference on human factors in computing systems (CHI '95) (pp. 210–217). New York, NY, USA: ACM Press/Addison-Wesley Publishing Co. doi:10.1145/223904.223931. http://dx.doi.org/10.1145/223904.223931.

  91. Shoemaker, P. J., Vos, T., & Reese, P. (2008). Journalists as gatekeepers. In K. W. Jorgensen & T. Hanitzsch (Eds.), The handbook of journalism studies (pp. 73–87). New York: Routledge.

  92. Shoemaker, P. J., & Vos, T. (2009). Gatekeeping theory (1st ed.). London: Routledge.

  93. Slater, P. E. (1955). Role differentiation in small groups. American Sociological Review, 20(3), 300–310.

  94. Smith, J., McCarthy, J. D., McPhail, C., & Augustyn, B. (2001). From protest to agenda building: Description bias in media coverage of protest events in Washington, D.C. Social Forces, 79(4), 1397–1423.

  95. Smyth, B. (2007). A community-based approach to personalizing web search. Computer, 40(8), 42–50. doi:10.1109/MC.2007.259.

  96. Soley, L. C. (2002). Censorship Inc: The corporate threat to free speech in the United States. USA: Monthly Review Press.

  97. Sturges, P. (2001). Gatekeepers and other intermediaries. Aslib Proceedings, 53(2), 62–67.

  98. Sullivan, D. (2012). Google’s results get more personal with ‘search plus your world’. Search engine land. http://goo.gl/xYoRV.

  99. Sunstein, C. R. (2002). Republic.com. USA: Princeton University Press.

  100. Sunstein, C. R. (2006). Preferences, paternalism, and liberty. Royal Institute of Philosophy Supplements, 59, 233–264.

  101. Sunstein, C. R. (2008). Infotopia: How many minds produce knowledge. USA: Oxford University Press.

  102. Taylor, D. (2011). Everything you need to know about Facebook’s EdgeRank. The Next Web. http://thenextweb.com/socialmedia/2011/05/09/everything-you-need-to-know-about-facebooks-edgerank/.

  103. Techcrunch. (2011). EdgeRank: The secret sauce that makes Facebook’s news feed tick.

  104. Tewksbury, D. (2003). What do Americans really want to know? Tracking the behavior of news readers on the internet. Journal of Communication, 53(4), 694–710.

  105. Twitter. (2010). To trend or not to trend. http://blog.twitter.com/2010/12/to-trend-or-not-to-trend.html.

  106. Tynan, D. (2012). How companies buy Facebook friends, likes, and buzz. PCWorld.

  107. Upbin, B. (2011). Facebook ushers in era of new social gestures. Forbes.

  108. US Securities and Exchange Commission. (2009). Google Inc., Consolidated Balance Sheets. http://www.sec.gov/Archives/edgar/data/1288776/000119312509150129/dex992.htm.

  109. Van Couvering, E. (2007). Is relevance relevant? Market, science, and war: discourses of search engine quality. Journal of Computer-Mediated Communication, 12(3), 866.

  110. Van der Hof, S., & Prins, C. (2008). Personalisation and its influence on identities, behaviour and social values. Profiling the European citizen: Cross-disciplinary perspectives (pp. 111–127). Netherlands: Springer.

  111. Vaughan, L., & Thelwall, M. (2004). Search engine coverage bias: Evidence and possible causes. Information Processing and Management, 40(4), 693–707.

  112. Witten, I. H. (2007). Bias, privacy, and personalization on the web. In M. Sankara Reddy & H. Kumar (Eds.), E-libraries: Problems and perspectives. New Delhi: Allied.

  113. Wittman, C. (2011). Comments 4x more valuable than likes. http://goo.gl/wnSES.

  114. Wright, J. D. (2011). Defining and measuring search bias: Some preliminary evidence. International Center for Law & Economics, November 2011; George Mason Law & Economics Research Paper No. 12–14. Available at SSRN: http://ssrn.com/abstract=2004649.

  115. Yu, C., Lakshmanan, L., & Amer-Yahia, S. (2009). It takes variety to make a world: Diversification in recommender systems. In Proceedings of the 12th international conference on extending database technology: Advances in database technology (pp. 368–378). http://dl.acm.org/citation.cfm?id=1516404.

  116. Yue, Y., Patel, R., & Roehrig, H. (2010). Beyond position bias: Examining result attractiveness as a source of presentation bias in click through data. In Proceedings of the 19th international conference on World Wide Web (pp. 1011–1018). http://dl.acm.org/citation.cfm?id=1772793.

  117. Zhang, M., & Hurley, N. (2008). Avoiding monotony: Improving the diversity of recommendation lists. In 2008 ACM international conference on recommender systems (ACM RecSys ’08) (pp. 123–130). Lausanne, Switzerland.

  118. Zimmer, M. (2011). Facebook’s censorship problem. June. http://www.huffingtonpost.com/michael-zimmer/facebooks-censorship-prob_b_852001.html.

Acknowledgments

The author would like to thank Martijn Warnier and Ibo van de Poel for their valuable comments. This research is supported by the Netherlands Organization for Scientific Research (NWO) Mozaiek grant, file number 017.007.111.

Author information

Corresponding author

Correspondence to Engin Bozdag.

About this article

Cite this article

Bozdag, E. Bias in algorithmic filtering and personalization. Ethics Inf Technol 15, 209–227 (2013). https://doi.org/10.1007/s10676-013-9321-6

Keywords

  • Information politics
  • Bias
  • Social filtering
  • Algorithmic gatekeeping