
FairDistillation: Mitigating Stereotyping in Language Models

  • Conference paper
  • First Online:
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13714)

Abstract

Large pre-trained language models are successfully being used in a variety of tasks, across many languages. With this ever-increasing usage, the risk of harmful side effects also rises, for example by reproducing and reinforcing stereotypes. However, detecting and mitigating these harms is difficult to do in general and becomes computationally expensive when tackling multiple languages or when considering different biases. To address this, we present FairDistillation: a cross-lingual method based on knowledge distillation to construct smaller language models while controlling for specific biases. We found that our distillation method does not negatively affect the downstream performance on most tasks and successfully mitigates stereotyping and representational harms. We demonstrate that FairDistillation can create fairer language models at a considerably lower cost than alternative approaches.
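The knowledge-distillation setup the method builds on (Hinton et al. [17]) trains a small student model to match a larger teacher's temperature-softened output distribution. A minimal stdlib sketch of that generic objective follows; this is an illustration of standard distillation, not the paper's exact loss, and the function names are ours:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    z = [l / temperature for l in logits]
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the
    student's, scaled by T^2 so gradient magnitudes stay comparable
    across temperatures (as in Hinton et al. [17])."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = sum(pt * (math.log(pt) - math.log(ps))
             for pt, ps in zip(p_t, p_s))
    return temperature ** 2 * kl
```

Identical student and teacher logits yield a loss of zero; a higher temperature exposes more of the teacher's "dark knowledge" in the non-argmax classes. FairDistillation's contribution, per the abstract, is to control for specific biases during this distillation step rather than distilling the teacher's distribution unchanged.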


Notes

  1. The original BERT corpus is a concatenation of Wikipedia and the Toronto BookCorpus [14].

  2. https://raw.githubusercontent.com/facebookresearch/AugLy/main/augly/assets/text/gendered_words_mapping.json

  3. https://github.com/keitakurita/contextual_embedding_bias_measure/blob/master/notebooks/data/employeesalaries2017.csv

  4. https://oscar-corpus.com

  5. https://people.cs.kuleuven.be/~pieter.delobelle/data.html

References

  1. Bartl, M., Nissim, M., Gatt, A.: Unmasking contextual stereotypes: measuring and mitigating BERT's gender bias. arXiv:2010.14534 [cs] (2020)

  2. Bender, E.M., Gebru, T., McMillan-Major, A., Shmitchell, S.: On the dangers of stochastic parrots: can language models be too big? In: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, pp. 610–623 (2021)

  3. Blodgett, S.L., Barocas, S., Daumé III, H., Wallach, H.: Language (technology) is power: a critical survey of "bias" in NLP. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 5454–5476. Association for Computational Linguistics (2020). https://doi.org/10.18653/v1/2020.acl-main.485

  4. Blodgett, S.L., Lopez, G., Olteanu, A., Sim, R., Wallach, H.: Stereotyping Norwegian salmon: an inventory of pitfalls in fairness benchmark datasets. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 1004–1015 (2021)

  5. Bolukbasi, T., Chang, K.W., Zou, J., Saligrama, V., Kalai, A.: Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. arXiv:1607.06520 [cs, stat] (2016)

  6. Bresnan, J., Kaplan, R.M., Peters, S., Zaenen, A.: Cross-serial dependencies in Dutch. In: Savitch, W.J., Bach, E., Marsh, W., Safran-Naveh, G. (eds.) The Formal Complexity of Natural Language. Studies in Linguistics and Philosophy, vol. 33, pp. 286–319. Springer, Dordrecht (1982). https://doi.org/10.1007/978-94-009-3401-6_11

  7. Buciluǎ, C., Caruana, R., Niculescu-Mizil, A.: Model compression. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 535–541 (2006)

  8. Caliskan, A., Bryson, J.J., Narayanan, A.: Semantics derived automatically from language corpora contain human-like biases. Science 356(6334), 183–186 (2017). https://doi.org/10.1126/science.aal4230

  9. Conneau, A., et al.: XNLI: evaluating cross-lingual sentence representations. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics (2018)

  10. De Raedt, L., Kimmig, A., Toivonen, H.: ProbLog: a probabilistic Prolog and its application in link discovery. In: IJCAI, Hyderabad, vol. 7, pp. 2462–2467 (2007)

  11. Delobelle, P., Tokpo, E., Calders, T., Berendt, B.: Measuring fairness with biased rulers: a comparative study on bias metrics for pre-trained language models. In: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1693–1706. Association for Computational Linguistics, Seattle (2022)

  12. Delobelle, P., Winters, T., Berendt, B.: RobBERT: a Dutch RoBERTa-based language model. In: Findings of ACL: EMNLP 2020 (2020)

  13. Delobelle, P., Winters, T., Berendt, B.: RobBERTje: a distilled Dutch BERT model. Comput. Linguist. Netherlands J. 11, 125–140 (2022). https://www.clinjournal.org/clinj/article/view/131

  14. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186. Association for Computational Linguistics, Minneapolis (2019). https://doi.org/10.18653/v1/N19-1423

  15. Feng, S.Y., et al.: A survey of data augmentation approaches for NLP. arXiv preprint arXiv:2105.03075 (2021)

  16. Hall Maudslay, R., Gonen, H., Cotterell, R., Teufel, S.: It's all in the name: mitigating gender bias with name-based counterfactual data substitution. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 5267–5275. Association for Computational Linguistics, Hong Kong (2019). https://doi.org/10.18653/v1/D19-1530

  17. Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. arXiv:1503.02531 [cs, stat] (2015)

  18. Houlsby, N., et al.: Parameter-efficient transfer learning for NLP. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 97, pp. 2790–2799. PMLR (2019). https://proceedings.mlr.press/v97/houlsby19a.html

  19. Jiao, X., et al.: TinyBERT: distilling BERT for natural language understanding. In: Findings of ACL: EMNLP 2020 (2020)

  20. Kurita, K., Vyas, N., Pareek, A., Black, A.W., Tsvetkov, Y.: Measuring bias in contextualized word representations. In: Proceedings of the First Workshop on Gender Bias in Natural Language Processing, pp. 166–172. Association for Computational Linguistics, Florence (2019). https://doi.org/10.18653/v1/W19-3823

  21. Lauscher, A., Lueken, T., Glavaš, G.: Sustainable modular debiasing of language models. In: Findings of the Association for Computational Linguistics: EMNLP 2021, pp. 4782–4797. Association for Computational Linguistics, Punta Cana (2021). https://doi.org/10.18653/v1/2021.findings-emnlp.411

  22. Liu, Y., et al.: RoBERTa: a robustly optimized BERT pretraining approach. arXiv:1907.11692 [cs] (2019)

  23. Lu, K., Mardziel, P., Wu, F., Amancharla, P., Datta, A.: Gender bias in neural natural language processing. In: Nigam, V., et al. (eds.) Logic, Language, and Security. LNCS, vol. 12300, pp. 189–202. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-62077-6_14

  24. Maas, A.L., Daly, R.E., Pham, P.T., Huang, D., Ng, A.Y., Potts, C.: Learning word vectors for sentiment analysis. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp. 142–150. Association for Computational Linguistics, Portland (2011). http://www.aclweb.org/anthology/P11-1015

  25. Martin, L., et al.: CamemBERT: a tasty French language model. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 7203–7219. Association for Computational Linguistics (2020). https://doi.org/10.18653/v1/2020.acl-main.645

  26. May, C., Wang, A., Bordia, S., Bowman, S.R., Rudinger, R.: On measuring social biases in sentence encoders. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 622–628. Association for Computational Linguistics, Minneapolis (2019). https://doi.org/10.18653/v1/N19-1063

  27. Ortiz Suárez, P.J., Romary, L., Sagot, B.: A monolingual approach to contextualized word embeddings for mid-resource languages. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 1703–1714. Association for Computational Linguistics (2020). https://www.aclweb.org/anthology/2020.acl-main.156

  28. Papakipos, Z., Bitton, J.: AugLy: data augmentations for robustness. arXiv preprint arXiv:2201.06494 (2022)

  29. Pfeiffer, J., Kamath, A., Rücklé, A., Cho, K., Gurevych, I.: AdapterFusion: non-destructive task composition for transfer learning. In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pp. 487–503. Association for Computational Linguistics (2021). https://doi.org/10.18653/v1/2021.eacl-main.39

  30. Pfeiffer, J., et al.: AdapterHub: a framework for adapting transformers. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 46–54. Association for Computational Linguistics (2020). https://doi.org/10.18653/v1/2020.emnlp-demos.7

  31. Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. In: NeurIPS EMC² Workshop (2019)

  32. Tan, Y.C., Celis, L.E.: Assessing social and intersectional biases in contextualized word representations. In: Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 32, pp. 13230–13241. Curran Associates, Inc. (2019)

  33. van der Burgh, B., Verberne, S.: The merits of universal language model fine-tuning for small datasets - a case with Dutch book reviews. arXiv:1910.00896 [cs] (2019)

  34. Vaswani, A., et al.: Attention is all you need. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems, vol. 30, pp. 5998–6008. Curran Associates, Inc. (2017)

  35. Wang, A., Singh, A., Michael, J., Hill, F., Levy, O., Bowman, S.R.: GLUE: a multi-task benchmark and analysis platform for natural language understanding. In: Proceedings of ICLR (2019)

  36. Webster, K., et al.: Measuring and reducing gendered correlations in pre-trained models. arXiv:2010.06032 [cs] (2020)

  37. Wijnholds, G., Moortgat, M.: SICK-NL: a dataset for Dutch natural language inference. arXiv preprint arXiv:2101.05716 (2021)


Acknowledgements

Pieter Delobelle was supported by the Research Foundation - Flanders (FWO) under EOS No. 30992574 (VeriLearn) and received funding from the Flemish Government under the "Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen" programme. Bettina Berendt received funding from the German Federal Ministry of Education and Research (BMBF) - Nr. 16DII113. Some resources and services used in this work were provided by the VSC (Flemish Supercomputer Center), funded by the Research Foundation - Flanders (FWO) and the Flemish Government.

Author information

Corresponding author

Correspondence to Pieter Delobelle.


A Hyperparameters

Table 3. The hyperparameter space used for finetuning.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Delobelle, P., Berendt, B. (2023). FairDistillation: Mitigating Stereotyping in Language Models. In: Amini, M.R., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds.) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science, vol. 13714. Springer, Cham. https://doi.org/10.1007/978-3-031-26390-3_37


  • DOI: https://doi.org/10.1007/978-3-031-26390-3_37

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-26389-7

  • Online ISBN: 978-3-031-26390-3

  • eBook Packages: Computer Science (R0)
