The paradoxical transparency of opaque machine learning

  • Open Forum
  • AI & SOCIETY

Abstract

This paper examines the paradoxical transparency involved in training machine-learning models. Existing literature typically critiques the opacity of machine-learning models such as neural networks or collaborative filtering, a critique that parallels the black-box critique in technology studies. Accordingly, people in power may leverage the models’ opacity to justify a biased result without subjecting the technical operations to public scrutiny, in what Dan McQuillan metaphorically depicts as an “algorithmic state of exception”. This paper attempts to differentiate the black-box abstraction that wraps around complex computational systems from the opacity of machine-learning models. It contends that the degree of asymmetry in knowledge is greater in the former than in the latter. In the case of software systems, source code is difficult to understand, and only software experts with sufficient domain knowledge are equipped to formulate a sound critique. In contrast, the meanings of the trained parameters in a machine-learning model are obscure even to the data scientists who configure and train the model. Hence, the asymmetry of knowledge lies only in how data examples are collected, in the choice and configuration of machine-learning models, and in the specification of features in model design. As algorithmic decision-making driven by machine-learning heuristics proliferates, the paper contends that the more symmetric distribution of knowledge in machine learning could lead to a more transparent production process if proper policies are in place.

Data availability

The manuscript has no associated data that need to be made available.

Notes

  1. Note that McQuillan does not employ the term “algorithmic governance” in “Algorithmic State of Exception” (2015) but he does use it in “Algorithmic Paranoia and the Convivial Alternative” (2016).

  2. “The technical object taken according to its essence, which is to say the technical object insofar as it has been invented, thought and willed, and taken up [assumé] by a human subject, becomes the medium [le support] and symbol of this relationship, which we would like to name transindividual” (Simondon 2016, p. 252). The transindividual reality is an inter-individual collective reality in which inter-human relations are “created through the intermediary of the technical objects” (2016, p. 254), and the relations with technical objects create “a coupling between the inventive and organizational capacities of several subjects” (2016, p. 258).

  3. Tertiary retention is Stiegler’s term for a type of permanent social memory made possible through technology. Writing, printing, databases, YouTube, and Facebook are all examples of tertiary retentional systems.

  4. The article “Six open source security myths debunked—and eight real challenges to consider” (Heath 2013) also makes a similar argument.

  5. The complexity of an ML model is indicated by the number of parameters that the model can be trained with, and every ML model can be made arbitrarily complex. For instance, adding polynomial features artificially expands the feature set of a linear regression. Too many parameters may lead to overfitting, which can be attenuated by training the model on a very large training set. Michele Banko and Eric Brill (2001) ran an experiment comparing the performance of different ML models trained on data sets of varying sizes, and found that all models perform remarkably similarly when there is enough data. Hence, they conclude, “it’s not who has the best algorithm that wins. It’s who has the most data.” A minimal sketch of polynomial feature expansion and the effect of data volume follows these notes.

  6. The two terms machine politics and technical politics have similar meanings. The former is taken from "Hard Choices in Artificial Intelligence" (Dobbe et al. 2021) and the latter from Andrew Feenberg’s works (e.g., see Technosystem (2017)). These works share the view that every technological system is inherently political, and they advocate collective agency and political deliberation during the technical design phase.

  7. For instance, running a neural network model requires just one round of forward propagation, whereas training it requires thousands of iterations of forward and backward propagation (a sketch contrasting inference and training follows these notes).

  8. E.g., Amazon Web Services (AWS) supports deep learning on its cloud services (https://aws.amazon.com/deep-learning/).

  9. Some companies may indeed develop or customize their own code for the algorithms that train machine-learning models. But it is nonetheless possible for regulatory policies to require that this code be separated into a software library that will not be subjected to third-party auditing.

  10. It is true, though, that auditors may also develop their own automated tools, embedded with ML models, for scanning for anomalies or biases in either data or software code. For instance, auditors could design and train their own ML models to detect software code written with fraudulent intent, similar to the models designed to detect fraudulent behaviour in online transactions. So if the legal issue of proprietary trade secrets is resolved and policies regulating third-party audits of software code are set in place, such software-auditing tools may become available, making it feasible to audit a large software codebase.

  11. There are works in academia that exemplify how critique becomes feasible when the design process of machine learning is transparent. One such example is Wendy Chun’s critique of the paper “Deep Neural Networks Are More Accurate Than Humans at Detecting Sexual Orientation From Facial Images” (Wang and Kosinski 2018) in Chapter 4 of her Discriminating Data (Chun 2021).

  12. I am using ‘formal bias’ as defined in Transforming Technology (Feenberg 2002, pp. 80–82).

  13. According to Zuboff (2019, p. 328), “[i]n this future we are exiles from our own behavior, denied access to or control over knowledge derived from our experience. Knowledge, authority, and power rest with surveillance capital, for which we are merely ‘human natural resources’.”

  14. Note that Simondon also uses the term “regulative external milieu” to propose the proper relation between the social and cultural milieu and technology development (see 2016, pp. 49, 129).

  15. Lee et al. (2021, p. 12) discuss the limitations of some post-hoc explanation techniques in XAI. E.g., Local Interpretable Model-agnostic Explanations (LIME) “has been shown not to be robust: given two very similar inputs that result in very similar outputs from the model, LIME is not guaranteed to produce similar explanations.” Also, as Watson (2021, p. 10) puts it, it is questionable whether interpretable machine learning can “really settle matters, or merely push the problem one rung up the ladder”. A brief sketch of a LIME explanation follows these notes.
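
The following is a minimal, hypothetical sketch of the point made in note 5 (the data, polynomial degree, and scikit-learn pipeline are illustrative assumptions, not taken from the paper): a linear regression is made arbitrarily more complex by polynomial feature expansion, it overfits on a small training set, and its held-out performance improves as the training set grows, consistent with Banko and Brill’s observation.

```python
# Hypothetical illustration for note 5: polynomial feature expansion and
# the effect of training-set size on an over-parameterized linear model.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

def held_out_score(n_samples, degree):
    # Noisy samples of an underlying cubic function (illustrative only).
    X = rng.uniform(-3, 3, size=(n_samples, 1))
    y = X[:, 0] ** 3 - 2 * X[:, 0] + rng.normal(0, 3, size=n_samples)
    X_test = rng.uniform(-3, 3, size=(1000, 1))
    y_test = X_test[:, 0] ** 3 - 2 * X_test[:, 0] + rng.normal(0, 3, size=1000)
    # The same linear model, made more complex by polynomial features.
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X, y)
    return model.score(X_test, y_test)  # R^2 on held-out data

for n in (20, 200, 20_000):
    # The degree-15 model overfits badly at n=20 and recovers with more data.
    print(n, round(held_out_score(n, degree=15), 3))
```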
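
To make the contrast in note 7 concrete, here is a toy PyTorch sketch (the model and data are assumptions made purely for illustration): running the trained network is a single forward pass, while training repeats forward and backward propagation thousands of times.

```python
# Hypothetical illustration for note 7: inference is one forward pass;
# training loops over forward and backward propagation.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
x, y = torch.randn(64, 10), torch.randn(64, 1)

# Running the model: a single round of forward propagation, no gradients.
with torch.no_grad():
    prediction = model(x)

# Training the model: thousands of iterations of forward and backward passes.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()
for step in range(5000):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)  # forward propagation
    loss.backward()              # backward propagation (gradient computation)
    optimizer.step()             # parameter update
```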
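
As a concrete reference point for note 15, the sketch below shows a typical LIME workflow (the data set, classifier, and the open-source lime package are assumptions for illustration; none are discussed in the paper). Re-running the explainer on a slightly perturbed input may yield a noticeably different feature ranking, which is the robustness problem Lee et al. (2021) describe.

```python
# Hypothetical illustration for note 15: a LIME post-hoc explanation of a
# single prediction from a black-box classifier.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)

# Fit a local surrogate model around one instance and report feature weights.
explanation = explainer.explain_instance(
    data.data[0], model.predict_proba, num_features=5
)
print(explanation.as_list())
```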

References

  • Agamben G (2005) State of exception. University of Chicago Press, Chicago

  • Agamben G (2020) The state of exception provoked by an unmotivated emergency. In: Positions Politics. https://positionspolitics.org/giorgio-agamben-the-state-of-exception-provoked-by-an-unmotivated-emergency/. Accessed 17 Aug 2021

  • Ananny M, Crawford K (2018) Seeing without knowing: limitations of the transparency ideal and its application to algorithmic accountability. New Media Soc 20(3):973–989

  • Araujo T, Helberger N, Kruikemeier S et al (2020) In AI we trust? Perceptions about automated decision-making by artificial intelligence. AI Soc 35(3):611–623. https://doi.org/10.1007/s00146-019-00931-w

  • Banko M and Brill E (2001) Scaling to very very large corpora for natural language disambiguation. In: Proceedings of the 39th annual meeting of the Association for Computational Linguistics, 2001, pp. 26–33

  • Berghoff C, Biggio B, Brummel E et al. (2021) Whitepaper: towards auditable AI systems, pp. 32

  • Boyd D, Crawford K (2011) Six provocations for big data. In: A decade in internet time: symposium on the dynamics of the internet and society, 2011

  • Brill J (2015) Scalable approaches to transparency and accountability in decisionmaking algorithms: remarks at the NYU conference on algorithms and accountability. Federal Trade Commission 28

  • Brooks FP (1975) The mythical man-month: essays on software engineering. Addison-Wesley Publisher Co, Reading

  • Brown S, Davidovic J, Hasan A (2021) The algorithm audit: scoring the algorithms that score us. Big Data Soc 8(1):2053951720983865. https://doi.org/10.1177/2053951720983865

  • Burrell J (2016) How the machine ‘thinks’: understanding opacity in machine learning algorithms. Big Data Soc 3(1):2053951715622512

  • Carabantes M (2020) Black-box artificial intelligence: an epistemological and critical analysis. AI Soc 35(2):309–317

  • Chan L (2021) Explainable AI as epistemic representation. In: Overcoming opacity in machine learning, pp. 7–8

  • Chun WHK (2021) Discriminating data: correlation, neighborhoods, and the new politics of recognition. The MIT Press, Cambridge

  • Creel KA (2020) Transparency in complex computational systems. Philos Sci 87(4):568–589

  • Crogan P (2019) Bernard Stiegler on Algorithmic Governmentality: A New Regimen of Truth? New Form 98:48–67. https://doi.org/10.3898/NEWF:98.04.2019

  • Datta A, Tschantz MC, Datta A (2015) Automated experiments on Ad privacy settings. Proc Priv Enhancing Technol 2015(1):92–112

  • Diakopoulos N (2016) Accountability in algorithmic decision making. Commun ACM 59(2):56–62

  • Dobbe R, Krendl Gilbert T, Mintz Y (2021) Hard choices in artificial intelligence. Artif Intell 300:103555. https://doi.org/10.1016/j.artint.2021.103555

  • Fainman AA (2019) The problem with Opaque AI. Thinker 82(4):44–55

  • Feenberg A (2002) Transforming technology: a critical theory revisited. Oxford University Press, New York

  • Feenberg A (2017) Technosystem: the social life of reason. Harvard University Press, Cambridge

  • Heath N (2013) Six open source security myths debunked—and eight real challenges to consider. https://www.zdnet.com/article/six-open-source-security-myths-debunked-and-eight-real-challenges-to-consider/. Accessed 29 Apr 2022

  • Huby G, Harries J (2021) Bloody paperwork: algorithmic governance and control in UK integrated health and social care settings. J Extreme Anthropol 5(1):1–28. https://doi.org/10.5617/jea.8285

  • Jarrahi MH, Newlands G, Lee MK et al (2021) Algorithmic management in a work context. Big Data Soc 8(2):20539517211020332

  • Lee K-F, Chen Q (2021) AI 2041, 1st edn. Currency, New York

  • Lee E, Taylor H, Hiley L et al (2021) Technical barriers to the adoption of post-hoc explanation methods for black box AI models. In: Overcoming opacity in machine learning, pp. 12–13

  • Levy E (2000) Wide open source. SecurityFocus.com, electronic document. http://www.securityfocus.com/news

  • Longoni C, Bonezzi A, Morewedge CK (2019) Resistance to medical artificial intelligence. J Consumer Res 46(4):629–650

  • Malik MM (2020) A hierarchy of limitations in machine learning. arXiv preprint arXiv:2002.05193

  • McKinney SM, Sieniek M, Godbole V et al (2020) International evaluation of an AI system for breast cancer screening. Nature 577(7788):89–94

  • McQuillan D (2015) Algorithmic states of exception. Eur J Cult Stud 18(4–5):564–576

  • McQuillan D (2016) Algorithmic paranoia and the convivial alternative. Big Data Soc 3(2):2053951716671340

  • Minsky M (1967) Why programming is a good medium for expressing poorly understood and sloppily formulated ideas. In: Design and planning II—Computers in design and communication. Hastings House, New York, pp. 120–125

  • Mittelstadt BD, Allo P, Taddeo M et al (2016) The ethics of algorithms: mapping the debate. Big Data Soc 3(2):2053951716679679

  • Müller VC (2021) Deep opacity undermines data protection and explainable artificial intelligence. In: Overcoming opacity in machine learning, pp. 18–21

  • Ozment A, Schechter SE (2006) Milk or wine: does software security improve with age? USENIX Secur Symp 2006:10–5555

  • Pasquale F (2015) The black box society. Harvard University Press

  • Pūraitė A, Zuzevičiūtė V, Bereikienė D et al. (2020) Algorithmic governance in public sector: is digitization a key to effective management. https://repository.mruni.eu/handle/007/17025. Accessed 17 Aug 2021.

  • Raymond ES (2001) The Cathedral and the Bazaar: musings on Linux and open source by an accidental revolutionary, rev. edn. O’Reilly, Cambridge

  • Rouvroy A, Berns T (2013a) Algorithmic governmentality and prospects of emancipation. Reseaux 177(1):163–196

  • Rouvroy A, Berns T (2013b) Gouvernementalité algorithmique et perspectives d’émancipation. Reseaux 177(1):163–196

  • Sandvig C, Hamilton K, Karahalios K et al (2014) Auditing algorithms: research methods for detecting discrimination on internet platforms. Data Discrimination: Converting Critical Concerns into Productive Inquiry 22:4349–4357

  • Schryen G (2011) Is open source security a myth? Commun ACM 54(5):130–140. https://doi.org/10.1145/1941487.1941516

  • Seaver N (2017) Algorithms as culture: Some tactics for the ethnography of algorithmic systems. Big Data Soc 4(2):2053951717738104

  • Silver D, Hubert T, Schrittwieser J et al (2018) A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362(6419):1140–1144

  • Simondon G (2016) On the mode of existence of technical objects. Univocal Publisher, Minneapolis

  • Smith GJ (2020) The politics of algorithmic governance in the black box city. Big Data Soc 7(2):2053951720933989. https://doi.org/10.1177/2053951720933989

  • Stiegler B (2016) Automatic society: the future of work. Polity Press, Cambridge

  • Stokes JM, Yang K, Swanson K et al (2020) A deep learning approach to antibiotic discovery. Cell 180(4):688–702

  • Sullivan E (2020) Understanding from machine learning models. The British Journal for the Philosophy of Science. The University of Chicago Press, Chicago

  • Supreme Audit Institutions (2020) Auditing Machine Learning Algorithms. https://www.auditingalgorithms.net/index.html. Accessed 16 Aug 2021

  • US National Security Commission (2021) NSCAI Final Report. https://www.nscai.gov/. Accessed 20 May 2021.

  • Wang Y, Kosinski M (2018) Deep neural networks are more accurate than humans at detecting sexual orientation from facial images. J Personal Soc Psychol 114(2):246

  • Watson DS (2021) No explanation without inference. In: Overcoming opacity in machine learning, pp. 9–11

  • Weizenbaum J (1976) Computer power and human reason: from judgment to calculation. Freeman, San Francisco

  • Zednik C, Boelsen H (2021) Preface: overcoming opacity in machine learning. In: Overcoming opacity in machine learning, pp. 1–2

  • Zou S (2021) Disenchanting trust: instrumental reason, algorithmic governance, and China’s emerging social credit system. Media Commun 9(2):140–149. https://doi.org/10.17645/mac.v9i2.3806

  • Zuboff S (2019) The age of surveillance capitalism: the fight for a human future at the new frontier of power, 1st edn. PublicAffairs, New York

Funding

The author has no financial or proprietary interests in any material discussed in this article.

Author information

Corresponding author

Correspondence to Felix Tun Han Lo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lo, F.T.H. The paradoxical transparency of opaque machine learning. AI & Soc (2022). https://doi.org/10.1007/s00146-022-01616-7

  • DOI: https://doi.org/10.1007/s00146-022-01616-7
