Abstract
This article presents Epistemic Spatialization as a new framework for investigating the interconnected patterns of bias that arise when identifying objects with convolutional neural networks (convnets). It draws upon Foucault’s notion of spatialized knowledge to guide its method of enquiry. We argue that decisions involved in the creation of algorithms, alongside the labeling, ordering, presentation, and commercial prioritization of objects, together create a distorted “nomination of the visible”: they harden the visibility of some objects, make other objects excessively visible, and consign yet others to permanent or haphazard invisibility. Our approach differs from those who focus on high-stakes misidentifications, such as errors tied to structural racism. Examining instead the far more prevalent low-stakes mistakes reveals the full scope of these errors, destabilizing the goal of image content identification in ways that carry considerable societal impact. We explore these issues by closely examining the demonstration video of a popular convnet. This examination reveals an interlocking series of biases undermining the content identification process. The picture we paint is crucial for a better understanding of the errors that result as these convnets become further embedded in everyday products. The framework is valuable for critical work on computer vision, AI studies, and large-scale visual analysis.
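The abstract’s claim that labeling choices and detection thresholds jointly “nominate the visible” can be made concrete with a minimal sketch. This is a hypothetical illustration written by the editors, not code from the article or from any specific convnet: the function name, labels, vocabulary, and confidence scores are all invented for exposition. It shows how a closed label vocabulary and a confidence cutoff silently drop some objects while surfacing others.

```python
# Hypothetical sketch of post-processing in an object-detection pipeline.
# All names and numbers are illustrative assumptions, not the article's method.

from dataclasses import dataclass


@dataclass
class Detection:
    label: str    # drawn from a closed vocabulary (e.g., a fixed class list)
    score: float  # model confidence in [0, 1]


def nominate_visible(detections, vocabulary, threshold=0.5):
    """Keep only detections whose label is in the vocabulary and whose
    confidence clears the threshold; everything else is silently dropped,
    i.e., rendered invisible in the output."""
    return [d for d in detections
            if d.label in vocabulary and d.score >= threshold]


raw = [
    Detection("person", 0.92),
    Detection("necktie", 0.55),   # over-represented categories surface easily
    Detection("elephant", 0.40),  # below threshold: dropped
    Detection("motorbike", 0.61),
]
vocab = {"person", "necktie", "motorbike"}  # "elephant" absent from vocabulary

visible = nominate_visible(raw, vocab)
print([d.label for d in visible])  # ['person', 'necktie', 'motorbike']
```

Note that the elephant is doubly invisible here: it is both absent from the vocabulary and below the confidence threshold, so no single design decision can be blamed for its disappearance. That compounding of small, low-stakes exclusions is the kind of pattern the framework is meant to surface.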
References
Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, Corrado GS, Davis A, Dean J, Devin M, Ghemawat S, Goodfellow IJ, Harp A, Irving G, Isard M, Jia Y, Jozefowicz R, Kaiser L, Kudlur M, Levenberg J, Mane D, Monga R, Moore S, Murray D, Olah C, Schuster M, Shlens J, Steiner B, Sutskever I, Talwar K, Tucker P, Vanhoucke V, Vasudevan V, Viegas F, Vinyals O, Warden P, Wattenberg M, Wicke M, Yu Y, Zheng X (2016) TensorFlow: large-scale machine learning on heterogeneous distributed systems. arXiv. [Online]. https://arxiv.org/abs/1603.04467. (Accessed on 28 March 2019)
Analytics Insight (2022) OpenAI’s DALL-E 2 can put an end to image recognition issues. Analytics Insight, Hyderabad. [Online]. https://www.AnalyticsInsight.net/openais-dalle-2-can-put-an-end-to-image-recognition-issues/. (Accessed on 20 April 2022)
Ananny M, Crawford K (2018) Seeing without knowing: limitations of the transparency ideal and its application to algorithmic accountability. New Med Soc. [Online] 3. https://doi.org/10.1177/1461444816676645. (Accessed on 17 June 2019)
Athalye A, Engstrom L, Ilyas A, Kwok K (2018) Synthesizing robust adversarial examples. arXiv. [Online]. https://arxiv.org/abs/1707.07397. (Accessed on 12 January 2019)
Barocas S, Hardt M, Narayanan A (2019) Fairness and machine learning: limitations and opportunities. fairmlbook.org
Basl J, Sandler R, Tiell S (2021) Getting from commitment to content in AI and data ethics: justice and explainability. Atlantic Council, Washington, D.C. [Online]. https://www.atlanticcouncil.org/in-depth-research-reports/report/specifying-normative-content/. (Accessed on 18 October 2021)
Benjamin R (2019) Race after technology. Polity, Cambridge
Bolya D, Zhou C, Xiao F, Lee YJ (2019) YOLACT: real-time instance segmentation. arXiv. [Online]. https://arxiv.org/abs/1904.02689. (Accessed on 12 June 2020)
Brinkmann L, Gezerli D, Kleist KV, Müller TF, Rahwan I, Pescetelli N (2022) Hybrid social learning in human-algorithm cultural transmission. Philos Trans R Soc A. [Online] 2227. https://doi.org/10.1098/rsta.2020.0426 (Accessed on 25 May 2022)
Buolamwini J, Gebru T (2018) Gender shades: intersectional accuracy disparities in commercial gender classification. Proc Mach Learn Res 81(1):1–15
Burrell J (2016) How the machine ‘thinks’: understanding opacity in machine learning algorithms. Big Data Soc. [Online]. https://doi.org/10.1177/2053951715622512. (Accessed on 12 May 2018)
Crawford K (2021) Atlas of AI. Yale University Press, New Haven
Crawford K, Paglen T (2019) Excavating AI: the politics of images in machine learning training sets. AI Now Institute, New York. [Online]. https://www.excavating.ai/. (Accessed on 20 January 2020)
Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) ImageNet: a large-scale hierarchical image database. In: 2009 IEEE computer society conference on computer vision and pattern recognition (CVPR 2009), 20–25 June 2009. [Online]. https://doi.org/10.1109/CVPR.2009.5206848
Deng J, Russakovsky O, Krause J, Bernstein M, Berg A, Fei-Fei L (2014) Scalable multi-label annotation. In: CHI ’14 Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 26 April–1 May 2014. [Online]. https://doi.org/10.1145/2556288.2557011
Diakopoulos N (2019) Automating the news. Harvard University Press, Cambridge
Drainville R (2018) Iconography for the age of social media. Humanities. [Online] 1. https://doi.org/10.3390/h7010012. (Accessed on 26 January 2018)
Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, Thrun S (2017) Dermatologist-level classification of skin cancer with deep neural networks. Nature. [Online]. https://doi.org/10.1038/nature21056. (Accessed on 16 April 2020)
Fei-Fei L, Iyer A, Koch C, Perona P (2007) What do we perceive in a glance of a real-world scene? J Vis. [Online] 1. https://doi.org/10.1167/7.1.10. (Accessed on 18 March 2019)
Foucault M (1966) The order of things, 2009th edn. Routledge, London
Foucault M (1969) The archaeology of knowledge, 2008th edn. Routledge, London
Foucault M (1975) Discipline and punish, 1991st edn. Penguin, London
Gerrish S (2018) How smart machines think. MIT Press, Cambridge
Goodfellow IJ, Shlens J, Szegedy C (2014) Explaining and harnessing adversarial examples. arXiv. [Online]. https://arxiv.org/abs/1412.6572. (Accessed on 1 April 2019)
Goodfellow IJ, Bengio Y, Courville A (2016) Deep learning. MIT Press, Cambridge
Goyal P, Caggiano V, Joulin A, Bojanowski P (2021) SEER: the start of a more powerful, flexible, and accessible era for computer vision. In: Research|computer vision. Facebook, Menlo Park. [Online]. https://ai.facebook.com/blog/seer-the-start-of-a-more-powerful-flexible-and-accessible-era-for-computer-vision/. (Accessed on 8 March 2021)
Gray ML, Suri S (2019) Ghost work. Houghton Mifflin Harcourt, Boston
Gu S, Lillicrap T, Sutskever I, Levine S (2016) Continuous deep Q-learning with model-based acceleration. In: Proceedings of the 33rd International Conference on Machine Learning, vol. 48. New York, 20–22 June 2016. PMLR, pp. 2829–2838.
Karpathy A (2014) What I learned from competing against a ConvNet on ImageNet. Github, San Francisco. [Online]. https://karpathy.github.io/2014/09/02/what-i-learned-from-competing-against-a-convnet-on-imagenet/. (Accessed on 9 January 2019)
Kelleher JD (2019) Deep learning. MIT Press, Cambridge
Kirillov A, He K, Girshick R, Rother C, Dollár P (2018) Panoptic segmentation. arXiv. [Online]. https://arxiv.org/abs/1801.00868. (Accessed on 28 April 2019)
Kirillov A, Girshick R, He K, Dollár P (2019) Panoptic feature pyramid networks. arXiv. [Online]. http://arxiv.org/abs/1901.02446. (Accessed on 28 April 2019)
Krishna R, Hata K, Chen S, Kravitz J, Shamma DA, Fei-Fei L, Bernstein M (2016a) Embracing error to enable rapid crowdsourcing. arXiv. [Online]. https://arxiv.org/abs/1602.04506. (Accessed on 28 May 2019)
Krishna R, Zhu Y, Groth O, Johnson J, Hata K, Kravitz J, Chen S, Kalantidis Y, Li L-J, Shamma DA, Bernstein M, Fei-Fei L (2016b) Visual genome: connecting language and vision using crowdsourced dense image annotations. arXiv. [Online]. https://arxiv.org/abs/1602.07332. (Accessed on 22 February 2019)
Krizhevsky A, Hinton GE (2009) Learning multiple layers of features from tiny images. University of Toronto, Toronto. [Online]. https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf. (Accessed on 2 April 2019)
Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Adv Neural Inform Process Syst. [Online]. https://doi.org/10.1145/3065386. (Accessed on 20 March 2019)
Lin T, Maire M, Belongie S, Bourdev L, Girshick R, Hays J, Perona P, Ramanan D, Zitnick CL, Dollár P (2014) Microsoft COCO: common objects in context. arXiv. [Online]. https://arxiv.org/abs/1405.0312. (Accessed on 4 April 2019)
Miller GA, Beckwith R, Fellbaum C (1990) Introduction to WordNet: an on-line lexical database. Int J Lexicogr. [Online] 4. https://doi.org/10.1093/ijl/3.4.235
Mishkin P (2022) DALL•E 2 preview: risks and limitations. GitHub, San Francisco. [Online]. https://github.com/openai/dalle-2-preview/blob/main/system-card.md. (Accessed on 20 April 2022)
Ng A (2014) Deep learning: machine learning via large-scale brain simulations. In: Robotics: Science and Systems (RSS) 2014. Berkeley, CA, 16 July 2014.
Ng A (2019) Machine learning yearning: technical strategy for AI engineers, in the era of deep learning. deeplearning.ai, Stanford. [Online]. https://www.deeplearning.ai/programs/. (Accessed on 23 April 2022)
Nguyen A, Yosinski J, Clune J (2015) Deep neural networks are easily fooled: high confidence predictions for unrecognizable images. In: IEEE conference on computer vision and pattern recognition (CVPR). [Online]. https://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Nguyen_Deep_Neural_Networks_2015_CVPR_paper.html. (Accessed on 22 May 2019)
Noble SU (2018) Algorithms of oppression. New York University Press, New York
Olah C, Mordvintsev A, Schubert L (2017) Feature visualization. Distill. [Online]. https://doi.org/10.23915/distill.00007. (Accessed on 28 April 2019)
Patel NV (2017) Why doctors aren’t afraid of better, more efficient AI diagnosing cancer. Daily Beast, New York. [Online]. https://www.thedailybeast.com/why-doctors-arent-afraid-of-better-more-efficient-ai-diagnosing-cancer. (Accessed on 12 December 2018)
Powell A (2021) Explanations as governance? Investigating practices of explanation in algorithmic system design. Eur J Commun. [Online] 4. https://doi.org/10.1177/02673231211028376. (Accessed on 4 January 2022)
Prabhu VU, Birhane A (2020) Large datasets: a pyrrhic win for computer vision?. arXiv. [Online]. https://arxiv.org/abs/2006.16923v2. (Accessed on 24 October 2020)
Quach K (2020) MIT apologises, permanently pulls offline huge dataset that taught AI systems to use racist, misogynistic slurs. The Register, London. [Online]. https://www.theregister.com/2020/07/01/mit_dataset_removed/. (Accessed on 24 October 2020)
Rajchman J (1988) Foucault’s art of seeing. October. [Online] 44 (Spring). https://doi.org/10.2307/778976. (Accessed on 4 February 2019)
Redmon J (2018) YOLOv3. YouTube, San Francisco. [Online]. https://www.youtube.com/watch?v=MPU2HistivI. (Accessed on 14 December 2021)
Redmon J, Divvala S, Girshick R, Farhadi A (2015) You only look once: unified, real-time object detection. arXiv. [Online]. https://arxiv.org/abs/1506.02640. (Accessed on 23 April 2019)
Rogers R (2021) Visual media analysis for instagram and other online platforms. Big Data Soc. [Online] 1. https://doi.org/10.1177/20539517211022370. (Accessed on 4 August 2021)
Rosenfeld A, Zemel R, Tsotsos JK (2018) The elephant in the room. arXiv. [Online]. https://arxiv.org/abs/1808.03305v1. (Accessed on 25 September 2018)
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Fei-Fei L (2014) ImageNet large scale visual recognition challenge. arXiv. [Online]. https://arxiv.org/abs/1409.0575. (Accessed on 6 January 2019)
Sharir O, Peleg B, Shoham Y (2020) The cost of training NLP models: a concise overview. arXiv. [Online]. https://arxiv.org/abs/2004.08900. (Accessed on 26 February 2021)
Shin D (2020) The effects of explainability and causability on perception, trust, and acceptance: implications for explainable AI. Int J Hum Comput Stud. [Online]. https://doi.org/10.1016/j.ijhcs.2020.102551. (Accessed on 15 January 2022)
Shin D (2022) How do people judge the credibility of algorithmic sources? AI Soc. [Online]. https://doi.org/10.1007/s00146-021-01158-4. (Accessed on 18 February 2022)
Shin D, Rasul A, Fotiadis A (2021) Why am I seeing this? Deconstructing algorithm literacy through the lens of users. Internet Res. [Online]. https://doi.org/10.1108/INTR-02-2021-0087. (Accessed on 15 January 2022)
Shin D, Kee KF, Shin EY (2022) Algorithm awareness: why user awareness is critical for personal privacy in the adoption of algorithmic platforms?. Int J Inform Manag. [Online]. https://doi.org/10.1016/j.ijinfomgt.2022.102494. (Accessed on 23 May 2022)
Small Z (2019) 600,000 images removed from AI database after art project exposes racist bias. New York: Hyperallergic. [Online]. https://hyperallergic.com/518822/600000-images-removed-from-ai-database-after-art-project-exposes-racist-bias/. (Accessed on 25 January 2020)
Snow J (2018) Amazon’s face recognition falsely matched 28 members of congress with mugshots. American Civil Liberties Union. [Online]. https://www.aclu.org/blog/privacy-technology/surveillance-technologies/amazons-face-recognition-falsely-matched-28. (Accessed on 14 August 2018)
VB Staff (2019) Why do 87% of data science projects never make it into production? VentureBeat, San Francisco. [Online]. https://venturebeat.com/2019/07/19/why-do-87-of-data-science-projects-never-make-it-into-production/. (Accessed on 15 December 2019)
Stepnick A, Martin A, Benedetti A, Karsgaard C, Ng CYT, Major D, Garcia-Mingo E, Granzotto F, Maia G, Gullal Krol J, van Vliet L, Geboers M, Kuculo T (2020) Black squares as (In)authentic behaviour: displays of solidarity on Twitter, Instagram, and Facebook. Digital Methods Initiative, University of Amsterdam. [Online]. https://wiki.digitalmethods.net/Dmi/SummerSchool2020BlackSquares. (Accessed on 8 April 2021)
Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2015) Rethinking the inception architecture for computer vision. arXiv. [Online]. https://arxiv.org/abs/1512.00567v3. (Accessed on 12 January 2018)
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2014) Going deeper with convolutions. arXiv. [Online]. https://arxiv.org/abs/1409.4842. (Accessed on 22 September 2018)
Torralba A, Fergus R, Freeman WT (2008) 80 million tiny images: a large dataset for non-parametric object and scene recognition. In: IEEE Transactions on Pattern Analysis and Machine Intelligence. [Online] 11. https://doi.org/10.1109/TPAMI.2008.128. (Accessed on 6 June 2019)
Tubaro P, Casilli AA, Coville M (2020) The trainer, the verifier, the imitator: three ways in which human platform workers support artificial intelligence. Big Data Soc. [Online] 1. https://doi.org/10.1177/2053951720919776. (Accessed on 14 March 2021)
W3Techs (2021) Usage statistics of content languages for websites. W3Techs/Q-Success, Maria Enzersdorf. [Online]. https://w3techs.com/technologies/overview/content_language. (Accessed on 15 January 2022)
YOLO Object Detection (2016) YOLO v2. YouTube, San Francisco. [Online]. https://www.youtube.com/watch?v=VOC3huqHrss. (Accessed on 14 December 2021)
Acknowledgements
The authors thank the following for their feedback: London Microdot Creative Visions (15 June 2019), NYU/Columbia University BLM Workshop (6 January 2020), the Stratford School of Interaction Design and Business’ WIPS series (19 November 2020), Tabatha Dominguez, Laura Fong, Paul Giladi, Jennifer Saul, and Karin Schmidlin.
Funding
The authors disclosed receipt of the following financial support for the research of this article: this work was supported by a Creative Economy Engagement Fellowship for “Imagining Futures” provided by the UK Arts and Humanities Research Council.
Ethics declarations
Conflict of interest
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Drainville, R., Vis, F. Elephant motorbikes and too many neckties: epistemic spatialization as a framework for investigating patterns of bias in convolutional neural networks. AI & Soc (2022). https://doi.org/10.1007/s00146-022-01542-8