Exploring the Impact of Gender Bias Mitigation Approaches on a Downstream Classification Task

Sobhani, Nasim; Delany, Sarah Jane

doi:10.1007/978-3-031-16564-1_10

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13515)

Included in the following conference series:

International Symposium on Methodologies for Intelligent Systems

Abstract

Natural language models and systems have been shown to reflect gender bias existing in training data. This bias can affect the downstream task that machine learning models built on this training data are meant to accomplish. A variety of techniques have been proposed to mitigate gender bias in training data. In this paper we compare different gender bias mitigation approaches on a classification task. We consider mitigation techniques that manipulate the training data itself, including data scrubbing, gender swapping and counterfactual data augmentation approaches. We also look at using de-biased word embeddings in the representation of the training data. We evaluate the effectiveness of the different approaches at reducing the gender bias in the training data and consider the impact on task performance. Our results show that many of the bias mitigation techniques do not adversely affect the performance of the classification task, but that the techniques vary significantly in how effectively they reduce gender bias.
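
The three data-manipulation techniques named in the abstract can be illustrated with a minimal Python sketch. It follows the way these techniques are commonly described in the literature; the abstract does not give the authors' exact procedures, so the word list `GENDER_PAIRS`, the placeholder token, and the function names `scrub`, `swap` and `augment` are all hypothetical choices made for illustration only.

```python
import re

# Hypothetical, deliberately tiny lexicon (an assumption for illustration);
# a real study would use a much larger list of gendered terms and names.
GENDER_PAIRS = {
    "he": "she", "she": "he",
    "him": "her", "his": "her",
    "her": "his",  # ambiguous: "her" can also map to "him"; simplified here
    "man": "woman", "woman": "man",
    "mr": "ms", "ms": "mr",
}

_TOKEN = re.compile(r"\b\w+\b")


def _match_case(source, target):
    """Capitalise the replacement if the original word was capitalised."""
    return target.capitalize() if source[0].isupper() else target


def scrub(text, placeholder="_"):
    """Data scrubbing: replace every gendered word with a neutral placeholder."""
    def repl(match):
        word = match.group(0)
        return placeholder if word.lower() in GENDER_PAIRS else word
    return _TOKEN.sub(repl, text)


def swap(text):
    """Gender swapping: replace every gendered word with its counterpart."""
    def repl(match):
        word = match.group(0)
        target = GENDER_PAIRS.get(word.lower())
        return _match_case(word, target) if target else word
    return _TOKEN.sub(repl, text)


def augment(examples):
    """Counterfactual data augmentation: keep each labelled example and
    add a gender-swapped copy of it with the same label."""
    augmented = []
    for text, label in examples:
        augmented.append((text, label))
        augmented.append((swap(text), label))
    return augmented


if __name__ == "__main__":
    print(scrub("She thanked him."))
    print(swap("He is a nurse and his patients like him."))
    print(augment([("He is a surgeon", "surgeon")]))
```

In this sketch, scrubbing and swapping keep the training set the same size, while augmentation doubles it. A plain dictionary lookup cannot resolve ambiguous forms such as "her", which is one reason practical implementations tend to need part-of-speech information or a curated lexicon.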

Acknowledgements

This publication has emanated from research conducted with the financial support of Science Foundation Ireland under Grant number 18/CRT/6183. For the purpose of Open Access, the author has applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.

Author information

Authors and Affiliations

Technological University Dublin, Dublin, Ireland
Nasim Sobhani & Sarah Jane Delany

Authors

Nasim Sobhani
Sarah Jane Delany

Corresponding author

Correspondence to Nasim Sobhani.

Editor information

Editors and Affiliations

Università degli Studi di Bari Aldo Moro, Bari, Italy
Michelangelo Ceci
Università della Calabria, Rende, Italy
Sergio Flesca
Università Federico II di Napoli, Naples, Italy
Elio Masciari
ICAR-CNR, Rende, Italy
Giuseppe Manco
University of North Carolina, Charlotte, NC, USA
Zbigniew W. Raś

About this paper

Cite this paper

Sobhani, N., Delany, S.J. (2022). Exploring the Impact of Gender Bias Mitigation Approaches on a Downstream Classification Task. In: Ceci, M., Flesca, S., Masciari, E., Manco, G., Raś, Z.W. (eds) Foundations of Intelligent Systems. ISMIS 2022. Lecture Notes in Computer Science, vol 13515. Springer, Cham. https://doi.org/10.1007/978-3-031-16564-1_10

DOI: https://doi.org/10.1007/978-3-031-16564-1_10
Published: 26 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16563-4
Online ISBN: 978-3-031-16564-1
eBook Packages: Computer Science, Computer Science (R0)
