Abstract
In this chapter, we look at the dangers and limitations of language models, with a focus on bias. Bias in AI in general, and in language models in particular, was neglected for many years of technology development. In recent years, after disturbing examples of discrimination caused by bias in AI software reached the mainstream media, research has begun to explore the topic, and it is finally getting the attention it deserves. We also discuss other risks, such as the ecological footprint of machine learning training and the sometimes problematic working conditions behind the scenes.
Notes
- 1.
- 2. Alice and Bob are typical names used in computer science as placeholders in explanations: https://en.wikipedia.org/wiki/Alice_and_Bob.
- 3. https://huggingface.co/course/chapter1/8?fw=pt. Hugging Face maintains the Transformers library, which is often used by data engineers working with transformer-based models.
- 4. Originally published online at https://www.societybyte.swiss/en/2022/12/22/hi-chatgpt-are-you-biased/
- 5. Later, another article reported that OpenAI paid 12.50 to the company for these services (Beuth et al. 2023).
References
Ahn J, Oh A (2021) Mitigating Language-Dependent Ethnic Bias in BERT. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (pp. 533-549).
Bender EM, Gebru T, McMillan-Major A, Shmitchell S (2021) On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (pp. 610-623).
Beuth P, Hoffmann H, Hoppenstedt M (2023) Die Gesichter hinter der KI. Der Spiegel Nr. 29.
Bolukbasi T, Chang KW, Zou JY, Saligrama V, Kalai AT (2016) Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. Advances in Neural Information Processing Systems, 29.
Caliskan A, Bryson JJ, Narayanan A (2017) Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183-186.
Crawford K (2021) Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence. Yale University Press.
Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 4171-4186).
Dräger J, Müller-Eiselt R (2020) We Humans and the Intelligent Machines: How algorithms shape our lives and how we can make good use of them. Verlag Bertelsmann Stiftung.
Eubanks V (2018) Automating inequality: How high-tech tools profile, police, and punish the poor. St. Martin's Press.
Joshi P, Santy S, Budhiraja A, Bali K, Choudhury M (2020) The State and Fate of Linguistic Diversity and Inclusion in the NLP World. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 6282-6293).
Perrigo B (2023) Exclusive: OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic. TIME. Available at https://time.com/6247678/openai-chatgpt-kenya-workers/, last accessed 23.05.2023.
Søraa RA (2023) AI for Diversity. CRC Press.
Strubell E, Ganesh A, McCallum A (2019) Energy and Policy Considerations for Deep Learning in NLP. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 3645-3650).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Kurpicz-Briki, M. (2023). Stereotypes in Language Models. In: More than a Chatbot. Springer, Cham. https://doi.org/10.1007/978-3-031-37690-0_6
DOI: https://doi.org/10.1007/978-3-031-37690-0_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37689-4
Online ISBN: 978-3-031-37690-0
eBook Packages: Computer Science, Computer Science (R0)