Stereotypes in Language Models

Kurpicz-Briki, Mascha

doi:10.1007/978-3-031-37690-0_6

Abstract

In this chapter, we look at the dangers and limitations that language models bring, with a focus on bias. Bias in AI in general, and in language models in particular, was neglected for many years of technology development. In recent years, after some disturbing examples of discrimination caused by bias in AI software reached the mainstream media, the topic has been taken up by researchers and is finally getting the attention it deserves. We also discuss other risks, such as the ecological footprint of language models and the sometimes critical working conditions behind the scenes of machine learning training.

Notes

1. In this chapter, the focus is on text-processing technologies. If you are interested in bias in AI in general, you might want to look at Dräger and Müller-Eiselt (2020) or Eubanks (2018).
2. Alice and Bob are typical names used in computer science as placeholders in explanations: https://en.wikipedia.org/wiki/Alice_and_Bob.
3. https://huggingface.co/course/chapter1/8?fw=pt. Hugging Face is a library often used by data engineers working with transformer-based models.
4. Originally published online at https://www.societybyte.swiss/en/2022/12/22/hi-chatgpt-are-you-biased/
5. Later another article reported that OpenAI paid 12.50 to the company for these services (Beuth et al. 2023).

References

Ahn J, Oh A (2021) Mitigating Language-Dependent Ethnic Bias in BERT. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (pp. 533-549).
Bender EM, Gebru T, McMillan-Major A, Shmitchell S (2021) On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (pp. 610-623).
Beuth P, Hoffmann H, Hoppenstedt M (2023) Die Gesichter hinter der KI. Der Spiegel Nr. 29.
Bolukbasi T, Chang KW, Zou JY, Saligrama V, Kalai AT (2016) Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. Advances in Neural Information Processing Systems, 29.
Caliskan A, Bryson JJ, Narayanan A (2017) Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183-186.
Crawford K (2021) Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence. Yale University Press.
Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 4171-4186).
Dräger J, Müller-Eiselt R (2020) We Humans and the Intelligent Machines: How algorithms shape our lives and how we can make good use of them. Verlag Bertelsmann Stiftung.
Eubanks V (2018) Automating inequality: How high-tech tools profile, police, and punish the poor. St. Martin's Press.
Joshi P, Santy S, Budhiraja A, Bali K, Choudhury M (2020) The State and Fate of Linguistic Diversity and Inclusion in the NLP World. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 6282-6293).
Perrigo B (2023) Exclusive: OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic. TIME. Available at https://time.com/6247678/openai-chatgpt-kenya-workers/, last accessed 23.05.2023.
Søraa RA (2023) AI for Diversity. CRC Press.
Strubell E, Ganesh A, McCallum A (2019) Energy and Policy Considerations for Deep Learning in NLP. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 3645-3650).

Author information

Authors and Affiliations

Applied Machine Intelligence, Bern University of Applied Sciences, Biel/Bienne, Switzerland
Mascha Kurpicz-Briki

About this chapter

Cite this chapter

Kurpicz-Briki, M. (2023). Stereotypes in Language Models. In: More than a Chatbot. Springer, Cham. https://doi.org/10.1007/978-3-031-37690-0_6

DOI: https://doi.org/10.1007/978-3-031-37690-0_6
Published: 07 December 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37689-4
Online ISBN: 978-3-031-37690-0
eBook Packages: Computer Science; Computer Science (R0)
