Abstract
This paper examines the paradoxical transparency involved in training machine-learning models. Existing literature typically critiques the opacity of machine-learning models such as neural networks or collaborative filtering, a critique that parallels the black-box critique in technology studies. On this account, people in power may leverage a model’s opacity to justify biased results without subjecting the technical operations to public scrutiny, in what Dan McQuillan metaphorically depicts as an “algorithmic state of exception”. This paper differentiates the black-box abstraction that wraps around complex computational systems from the opacity of machine-learning models, and contends that the asymmetry of knowledge is greater in the former than in the latter. In conventional software systems, source code is difficult to understand, and only software experts with sufficient domain knowledge are equipped to formulate a sound critique. In contrast, the meanings of the trained parameters of a machine-learning model are obscure even to the data scientists who configure and train it. Hence, the asymmetry of knowledge lies only in how data examples are collected, the choice and configuration of machine-learning models, and the specification of features in model design. As algorithmic decision-making increasingly relies on machine-learning heuristics, the paper contends that this more symmetric distribution of knowledge could lead to a more transparent production process, provided proper policies are in place.
Data availability
The manuscript has no associated data that need to be made available.
Notes
“The technical object taken according to its essence, which is to say the technical object insofar as it has been invented, thought and willed, and taken up [assumé] by a human subject, becomes the medium [le support] and symbol of this relationship, which we would like to name transindividual” (Simondon 2016, p. 252). The transindividual reality is an inter-individual collective reality in which inter-human relations are “created through the intermediary of the technical objects” (2016, p. 254), and the relations with technical objects create “a coupling between the inventive and organizational capacities of several subjects” (2016, p. 258).
Tertiary retention is Stiegler’s term for a type of permanent social memory made possible through technology. Writing, printing, databases, YouTube, and Facebook are all examples of tertiary retentional systems.
The article “Six open source security myths debunked—and eight real challenges to consider” (Heath 2013) also makes a similar argument.
The complexity of an ML model is indicated by the number of parameters that the model can be trained with, and every ML model can be made arbitrarily complex. For instance, adding polynomial features artificially expands the feature set of linear regression. Too many parameters may lead to overfitting, which can be attenuated by training the model on a very large training set. Michele Banko and Eric Brill (2001) conducted an experiment comparing the performance of different ML models trained on data sets of varying sizes. They found that all models perform remarkably similarly when there is enough data. Hence they conclude, “it’s not who has the best algorithm that wins. It’s who has the most data.”
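The point about expanding a linear model's capacity can be made concrete with a minimal numpy sketch (illustrative only; the function name `polynomial_features` and the toy data are my own, not from Banko and Brill's experiment):

```python
import numpy as np

# Linear regression stays linear in its parameters, but adding polynomial
# features expands the number of trainable parameters arbitrarily.
def polynomial_features(x, degree):
    """Expand a 1-D feature into columns [1, x, x^2, ..., x^degree]."""
    return np.vander(x, degree + 1, increasing=True)

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 50)
y = np.sin(3 * x) + rng.normal(0, 0.1, 50)   # noisy nonlinear target

param_counts = []
for degree in (1, 5, 15):
    X = polynomial_features(x, degree)
    # Ordinary least squares: one trained parameter per polynomial term.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    param_counts.append(w.size)
print(param_counts)  # [2, 6, 16]
```

The model family is unchanged throughout; only the feature expansion grows, which is what makes the parameter count, rather than the algorithm itself, the measure of complexity.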
The two terms machine politics and technical politics have similar meanings. The former is taken from "Hard Choices in Artificial Intelligence" (Dobbe et al. 2021) and the latter from Andrew Feenberg’s works (e.g., see Technosystem (2017)). These works share the view that every technological system is inherently political, and they advocate collective agency and political deliberation during the technical design phase.
For instance, running a neural network model requires just one round of forward propagation, whereas training a neural network model requires thousands of iterations of forward and backward propagation.
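The asymmetry between running and training can be sketched with a toy network in numpy (an illustrative sketch under my own assumptions about layer sizes and iteration count, not a production implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
# A tiny fully-connected network: 4 inputs -> 8 hidden units -> 1 output.
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 1))

def forward(X, W1, W2):
    """One round of forward propagation (all that running the model needs)."""
    h = np.maximum(0, X @ W1)          # hidden layer with ReLU activation
    return h @ W2, h

# Running the model: a single forward pass per input.
y_hat, _ = forward(rng.normal(size=(1, 4)), W1, W2)

# Training: thousands of forward + backward iterations of gradient descent.
X = rng.normal(size=(32, 4))
Y = rng.normal(size=(32, 1))
lr, losses = 0.01, []
for _ in range(2000):
    out, h = forward(X, W1, W2)
    losses.append(np.mean((out - Y) ** 2))
    err = (out - Y) / len(X)           # squared-error signal, batch-averaged
    grad_W2 = h.T @ err                # backward propagation, layer 2
    grad_h = err @ W2.T
    grad_h[h <= 0] = 0                 # gradient through ReLU
    grad_W1 = X.T @ grad_h             # backward propagation, layer 1
    W1 -= lr * grad_W1
    W2 -= lr * grad_W2
```

Each training iteration contains a forward pass plus a backward pass, so the computational cost of training dwarfs that of inference by orders of magnitude.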
E.g. Amazon Web Service (AWS) supports deep learning on their cloud services (https://aws.amazon.com/deep-learning/).
Some companies may indeed develop or customize the code for the algorithms that train machine-learning models. But it is nonetheless possible for regulatory policies to require that this code be separated into a software library that will not be subjected to third-party auditing.
It is true, though, that auditors may also develop their own automated tools, embedded with ML models, for scanning anomalies or biases in either data or software code. It is conceivable that auditors could design and train their own ML models to detect software code with fraudulent motives, similar to the models designed to detect fraudulent behaviour in online transactions. If the legal issue of proprietary trade secrets is resolved and policies regulating third-party audits of software code are set in place, such auditing tools may become available, making it feasible to conduct third-party audits of a large software codebase.
There are works in academia that exemplify how critique becomes feasible when the design process of machine learning is transparent. One such example is Wendy Chun’s critique of the paper “Deep Neural Networks Are More Accurate Than Humans at Detecting Sexual Orientation From Facial Images” (Wang and Kosinski 2018) in Chapter 4 of her Discriminating Data (Chun 2021).
I am using ‘formal bias’ as defined in Transforming Technology (Feenberg 2002, pp. 80–82).
According to Zuboff (2019, p. 328), “[i]n this future we are exiles from our own behavior, denied access to or control over knowledge derived from our experience. Knowledge, authority, and power rest with surveillance capital, for which we are merely ‘human natural resources’.”
Note that Simondon also uses the term “regulative external milieu” to propose the proper relation between the social and cultural milieu and technology development (see 2016, pp. 49, 129).
Lee et al. (2021, p. 12) discuss the limitations of some post-hoc explanation techniques in XAI. For example, Local Interpretable Model-agnostic Explanations (LIME) “has been shown not to be robust: given two very similar inputs that result in very similar outputs from the model, LIME is not guaranteed to produce similar explanations.” Also, as Watson (2021, p. 10) puts it, it is questionable whether interpretable machine learning techniques “really settle matters, or merely push the problem one rung up the ladder”.
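The mechanism behind LIME-style explanations can be illustrated with a minimal local-surrogate sketch in numpy (my own simplification, not the actual LIME implementation; the black-box function and sampling scale are invented for illustration). The explanation is simply the coefficient vector of a weighted linear fit around the query point, which is why resampling the perturbations can shift the explanation:

```python
import numpy as np

rng = np.random.default_rng(0)

# An illustrative stand-in for a black-box model we want to explain locally.
def black_box(X):
    return np.sin(3 * X[:, 0]) + X[:, 1] ** 2

def lime_style_explanation(x0, n_samples=500, scale=0.1):
    """Fit a proximity-weighted local linear surrogate around x0.
    Its coefficients serve as the 'explanation' of the black box near x0."""
    X = x0 + rng.normal(0, scale, size=(n_samples, x0.size))
    y = black_box(X)
    # Proximity weights: perturbations nearer x0 count more.
    w = np.exp(-np.sum((X - x0) ** 2, axis=1) / (2 * scale ** 2))
    A = np.hstack([np.ones((n_samples, 1)), X - x0])
    sw = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(A * sw, y * sw[:, 0], rcond=None)
    return coef[1:]  # local feature attributions

x0 = np.array([0.5, -0.2])
attributions = lime_style_explanation(x0)
```

Because the surrogate is fitted to a fresh random sample of perturbations on each call, two runs on the same input can yield different coefficient vectors, which is the non-robustness Lee et al. describe.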
References
Agamben G (2005) State of exception. University of Chicago Press, Chicago
Agamben G (2020) The state of exception provoked by an unmotivated emergency. Positions Politics. https://positionspolitics.org/giorgio-agamben-the-state-of-exception-provoked-by-an-unmotivated-emergency/. Accessed 17 Aug 2021
Ananny M, Crawford K (2018) Seeing without knowing: limitations of the transparency ideal and its application to algorithmic accountability. New Media Soc 20(3):973–989
Araujo T, Helberger N, Kruikemeier S et al (2020) In AI we trust? Perceptions about automated decision-making by artificial intelligence. AI Soc 35(3):611–623. https://doi.org/10.1007/s00146-019-00931-w
Banko M, Brill E (2001) Scaling to very very large corpora for natural language disambiguation. In: Proceedings of the 39th annual meeting of the Association for Computational Linguistics, pp. 26–33
Berghoff C, Biggio B, Brummel E et al (2021) Whitepaper: towards auditable AI systems, p. 32
Boyd D, Crawford K (2011) Six provocations for big data. In: A decade in internet time: symposium on the dynamics of the internet and society, 2011
Brill J (2015) Scalable approaches to transparency and accountability in decisionmaking algorithms: remarks at the NYU conference on algorithms and accountability. Federal Trade Commission 28
Brooks FP (1975) The mythical man-month: essays on software engineering. Addison-Wesley Publisher Co, Reading
Brown S, Davidovic J, Hasan A (2021) The algorithm audit: scoring the algorithms that score us. Big Data Soc 8(1):2053951720983865. https://doi.org/10.1177/2053951720983865
Burrell J (2016) How the machine ‘thinks’: understanding opacity in machine learning algorithms. Big Data Soc 3(1):2053951715622512
Carabantes M (2020) Black-box artificial intelligence: an epistemological and critical analysis. AI Soc 35(2):309–317
Chan L (2021) Explainable AI as epistemic representation. In: Overcoming opacity in machine learning, pp. 7–8
Chun WHK (2021) Discriminating data: correlation, neighborhoods, and the new politics of recognition. The MIT Press, Cambridge
Creel KA (2020) Transparency in complex computational systems. Philos Sci 87(4):568–589
Crogan P (2019) Bernard Stiegler on Algorithmic Governmentality: A New Regimen of Truth? New Form 98:48–67. https://doi.org/10.3898/NEWF:98.04.2019
Datta A, Tschantz MC, Datta A (2015) Automated experiments on Ad privacy settings. Proc Priv Enhancing Technol 2015(1):92–112
Diakopoulos N (2016) Accountability in algorithmic decision making. Commun ACM 59(2):56–62
Dobbe R, Krendl Gilbert T, Mintz Y (2021) Hard choices in artificial intelligence. Artif Intell 300:103555. https://doi.org/10.1016/j.artint.2021.103555
Fainman AA (2019) The problem with Opaque AI. Thinker 82(4):44–55
Feenberg A (2002) Transforming technology: a critical theory revisited. Oxford University Press, New York
Feenberg A (2017) Technosystem: the social life of reason. Harvard University Press, Cambridge
Heath N (2013) Six open source security myths debunked—and eight real challenges to consider. https://www.zdnet.com/article/six-open-source-security-myths-debunked-and-eight-real-challenges-to-consider/. Accessed 29 Apr 2022
Huby G, Harries J (2021) Bloody paperwork: algorithmic governance and control in UK integrated health and social care settings. J Extreme Anthropol 5(1):1–28. https://doi.org/10.5617/jea.8285
Jarrahi MH, Newlands G, Lee MK et al (2021) Algorithmic management in a work context. Big Data Soc 8(2):20539517211020332
Lee K-F, Chen Q (2021) AI 2041, 1st edn. Currency, New York
Lee E, Taylor H, Hiley L et al (2021) Technical barriers to the adoption of post-hoc explanation methods for black box AI models. In: Overcoming opacity in machine learning, pp. 12–13
Levy E (2000) Wide open source. SecurityFocus. http://www.securityfocus.com/news
Longoni C, Bonezzi A, Morewedge CK (2019) Resistance to medical artificial intelligence. J Consumer Res 46(4):629–650
Malik MM (2020) A hierarchy of limitations in machine learning. arXiv preprint arXiv:2002.05193
McKinney SM, Sieniek M, Godbole V et al (2020) International evaluation of an AI system for breast cancer screening. Nature 577(7788):89–94
McQuillan D (2015) Algorithmic states of exception. Eur J Cult Stud 18(4–5):564–576
McQuillan D (2016) Algorithmic paranoia and the convivial alternative. Big Data Soc 3(2):2053951716671340
Minsky M (1967) Why programming is a good medium for expressing poorly understood and sloppily formulated ideas. In: Design and planning II: computers in design and communication. Hastings House, New York, pp. 120–125
Mittelstadt BD, Allo P, Taddeo M et al (2016) The ethics of algorithms: mapping the debate. Big Data Soc 3(2):2053951716679679
Müller VC (2021) Deep opacity undermines data protection and explainable artificial intelligence. In: Overcoming opacity in machine learning, pp. 18–21
Ozment A, Schechter SE (2006) Milk or wine: does software security improve with age? USENIX Secur Symp 2006:10–5555
Pasquale F (2015) The black box society. Harvard University Press
Pūraitė A, Zuzevičiūtė V, Bereikienė D et al (2020) Algorithmic governance in public sector: is digitization a key to effective management? https://repository.mruni.eu/handle/007/17025. Accessed 17 Aug 2021
Raymond ES (2001) The Cathedral and the Bazaar: musings on Linux and open source by an accidental revolutionary, rev. edn. O’Reilly, Cambridge
Rouvroy A, Berns T (2013a) Algorithmic governmentality and prospects of emancipation. Reseaux 177(1):163–196
Rouvroy A, Berns T (2013b) Gouvernementalité algorithmique et perspectives d’émancipation. Reseaux 177(1):163–196
Sandvig C, Hamilton K, Karahalios K et al (2014) Auditing algorithms: research methods for detecting discrimination on internet platforms. Data Discrimination: Converting Critical Concerns into Productive Inquiry 22:4349–4357
Schryen G (2011) Is open source security a myth? Commun ACM 54(5):130–140. https://doi.org/10.1145/1941487.1941516
Seaver N (2017) Algorithms as culture: Some tactics for the ethnography of algorithmic systems. Big Data Soc 4(2):2053951717738104
Silver D, Hubert T, Schrittwieser J et al (2018) A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362(6419):1140–1144
Simondon G (2016) On the mode of existence of technical objects. Univocal Publisher, Minneapolis
Smith GJ (2020) The politics of algorithmic governance in the black box city. Big Data Soc 7(2):2053951720933989. https://doi.org/10.1177/2053951720933989
Stiegler B (2016) Automatic society: the future of work. Polity Press, Cambridge
Stokes JM, Yang K, Swanson K et al (2020) A deep learning approach to antibiotic discovery. Cell 180(4):688–702
Sullivan E (2020) Understanding from machine learning models. Br J Philos Sci
Supreme Audit Institutions (2020) Auditing Machine Learning Algorithms. https://www.auditingalgorithms.net/index.html. Accessed 16 August 2021
US National Security Commission (2021) NSCAI Final Report. https://www.nscai.gov/. Accessed 20 May 2021.
Wang Y, Kosinski M (2018) Deep neural networks are more accurate than humans at detecting sexual orientation from facial images. J Personal Soc Psychol 114(2):246
Watson DS (2021) No explanation without inference. In: Overcoming opacity in machine learning, pp. 9–11
Weizenbaum J (1976) Computer power and human reason: from judgment to calculation. Freeman, San Francisco
Zednik C, Boelsen H (2021) Preface: overcoming opacity in machine learning. In: Overcoming opacity in machine learning, pp. 1–2
Zou S (2021) Disenchanting trust: instrumental reason, algorithmic governance, and China’s emerging social credit system. Media Commun 9(2):140–149. https://doi.org/10.17645/mac.v9i2.3806
Zuboff S (2019) The age of surveillance capitalism: the fight for a human future at the new frontier of power, 1st edn. PublicAffairs, New York
Funding
The author has no financial or proprietary interests in any material discussed in this article.
About this article
Cite this article
Lo, F.T.H. The paradoxical transparency of opaque machine learning. AI & Soc (2022). https://doi.org/10.1007/s00146-022-01616-7