Abstract
Large pre-trained language models are successfully being used in a variety of tasks across many languages. With this ever-increasing usage, the risk of harmful side effects also rises, for example by reproducing and reinforcing stereotypes. However, detecting and mitigating these harms is difficult in general, and becomes computationally expensive when tackling multiple languages or considering different biases. To address this, we present FairDistillation: a cross-lingual method based on knowledge distillation to construct smaller language models while controlling for specific biases. We found that our distillation method does not negatively affect downstream performance on most tasks and successfully mitigates stereotyping and representational harms. We demonstrate that FairDistillation can create fairer language models at a considerably lower cost than alternative approaches.
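To make the idea concrete, the sketch below shows the kind of distillation objective such a method can build on: the student is trained against the teacher's softened output distribution (the classic soft-target loss of Hinton et al.) plus the usual masked-language-modelling loss, with the teacher's distribution adjusted beforehand so that counterfactual token pairs (e.g. "he"/"she") receive equal probability mass. This is a minimal illustration in PyTorch, not the paper's implementation: the function names, the simple pair-averaging rule, and the hyperparameters `T` and `alpha` are assumptions made for the example; FairDistillation's actual bias control is more involved (cf. the ProbLog reference below).

```python
import torch
import torch.nn.functional as F


def equalize_pairs(teacher_probs, pairs):
    """Average the teacher's probability mass over counterfactual token
    pairs (e.g. the vocabulary ids of "he" and "she") so the student
    cannot inherit the asymmetry. Illustrative stand-in, not the paper's
    exact rule-based mechanism."""
    probs = teacher_probs.clone()
    for a, b in pairs:  # (token_id_a, token_id_b) of one counterfactual pair
        mean = (probs[..., a] + probs[..., b]) / 2
        probs[..., a] = mean
        probs[..., b] = mean
    return probs


def distillation_loss(student_logits, teacher_logits, labels,
                      pairs, T=2.0, alpha=0.5):
    # Soften and (illustratively) debias the teacher's distribution.
    teacher_probs = F.softmax(teacher_logits / T, dim=-1)
    teacher_probs = equalize_pairs(teacher_probs, pairs)
    # Soft-target loss: the student mimics the adjusted teacher;
    # the T*T factor keeps gradient magnitudes comparable across T.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    teacher_probs, reduction="batchmean") * (T * T)
    # Hard-target loss: ordinary masked-language-modelling cross-entropy;
    # unmasked positions carry the ignore index -100.
    hard = F.cross_entropy(student_logits.view(-1, student_logits.size(-1)),
                           labels.view(-1), ignore_index=-100)
    return alpha * soft + (1 - alpha) * hard
```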
Notes
- 1. The original BERT corpus is a concatenation of Wikipedia and the Toronto BookCorpus [14].
References
Bartl, M., Nissim, M., Gatt, A.: Unmasking contextual stereotypes: measuring and mitigating BERT's gender bias. arXiv:2010.14534 [cs] (2020)
Bender, E.M., Gebru, T., McMillan-Major, A., Shmitchell, S.: On the dangers of stochastic parrots: can language models be too big? In: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, pp. 610–623 (2021)
Blodgett, S.L., Barocas, S., Daumé III, H., Wallach, H.: Language (technology) is power: a critical survey of "bias" in NLP. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 5454–5476. Association for Computational Linguistics (2020). https://doi.org/10.18653/v1/2020.acl-main.485, https://www.aclweb.org/anthology/2020.acl-main.485
Blodgett, S.L., Lopez, G., Olteanu, A., Sim, R., Wallach, H.: Stereotyping Norwegian salmon: an inventory of pitfalls in fairness benchmark datasets. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 1004–1015 (2021)
Bolukbasi, T., Chang, K.W., Zou, J., Saligrama, V., Kalai, A.: Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. arXiv:1607.06520 [cs, stat] (2016)
Bresnan, J., Kaplan, R.M., Peters, S., Zaenen, A.: Cross-serial dependencies in Dutch. In: Savitch, W.J., Bach, E., Marsh, W., Safran-Naveh, G. (eds.) The Formal Complexity of Natural Language. Studies in Linguistics and Philosophy, vol. 33, pp. 286–319. Springer, Dordrecht (1982). https://doi.org/10.1007/978-94-009-3401-6_11
Buciluǎ, C., Caruana, R., Niculescu-Mizil, A.: Model compression. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 535–541 (2006)
Caliskan, A., Bryson, J.J., Narayanan, A.: Semantics derived automatically from language corpora contain human-like biases. Science 356(6334), 183–186 (2017). https://doi.org/10.1126/science.aal4230, https://science.sciencemag.org/content/356/6334/183. ISSN 0036-8075
Conneau, A., et al.: XNLI: evaluating cross-lingual sentence representations. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics (2018)
De Raedt, L., Kimmig, A., Toivonen, H.: ProbLog: a probabilistic Prolog and its application in link discovery. In: IJCAI, Hyderabad, vol. 7, pp. 2462–2467 (2007)
Delobelle, P., Tokpo, E., Calders, T., Berendt, B.: Measuring fairness with biased rulers: a comparative study on bias metrics for pre-trained language models. In: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1693–1706. Association for Computational Linguistics, Seattle (2022)
Delobelle, P., Winters, T., Berendt, B.: RobBERT: a Dutch RoBERTa-based language model. In: Findings of ACL: EMNLP 2020 (2020)
Delobelle, P., Winters, T., Berendt, B.: RobBERTje: a distilled Dutch BERT model. Comput. Linguist. Netherlands J. 11, 125–140 (2022). https://www.clinjournal.org/clinj/article/view/131
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186. Association for Computational Linguistics, Minneapolis (2019). https://doi.org/10.18653/v1/N19-1423
Feng, S.Y., et al.: A survey of data augmentation approaches for NLP. arXiv preprint arXiv:2105.03075 (2021)
Hall Maudslay, R., Gonen, H., Cotterell, R., Teufel, S.: It's all in the name: mitigating gender bias with name-based counterfactual data substitution. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 5267–5275. Association for Computational Linguistics, Hong Kong (2019). https://doi.org/10.18653/v1/D19-1530, https://www.aclweb.org/anthology/D19-1530
Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. arXiv:1503.02531 [cs, stat] (2015)
Houlsby, N., et al.: Parameter-efficient transfer learning for NLP. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 97, pp. 2790–2799. PMLR (2019). https://proceedings.mlr.press/v97/houlsby19a.html
Jiao, X., et al.: TinyBERT: distilling BERT for natural language understanding. In: Findings of ACL: EMNLP 2020 (2020)
Kurita, K., Vyas, N., Pareek, A., Black, A.W., Tsvetkov, Y.: Measuring bias in contextualized word representations. In: Proceedings of the First Workshop on Gender Bias in Natural Language Processing, pp. 166–172. Association for Computational Linguistics, Florence (2019). https://doi.org/10.18653/v1/W19-3823, https://www.aclweb.org/anthology/W19-3823
Lauscher, A., Lueken, T., Glavaš, G.: Sustainable modular debiasing of language models. In: Findings of the Association for Computational Linguistics: EMNLP 2021, pp. 4782–4797. Association for Computational Linguistics, Punta Cana (2021). https://doi.org/10.18653/v1/2021.findings-emnlp.411, https://aclanthology.org/2021.findings-emnlp.411
Liu, Y., et al.: RoBERTa: a robustly optimized BERT pretraining approach. arXiv:1907.11692 [cs] (2019)
Lu, K., Mardziel, P., Wu, F., Amancharla, P., Datta, A.: Gender bias in neural natural language processing. In: Nigam, V., et al. (eds.) Logic, Language, and Security. LNCS, vol. 12300, pp. 189–202. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-62077-6_14. ISBN 978-3-030-62077-6
Maas, A.L., Daly, R.E., Pham, P.T., Huang, D., Ng, A.Y., Potts, C.: Learning word vectors for sentiment analysis. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp. 142–150. Association for Computational Linguistics, Portland (2011). http://www.aclweb.org/anthology/P11-1015
Martin, L., et al.: CamemBERT: a tasty French language model. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 7203–7219. Association for Computational Linguistics (2020). https://doi.org/10.18653/v1/2020.acl-main.645
May, C., Wang, A., Bordia, S., Bowman, S.R., Rudinger, R.: On measuring social biases in sentence encoders. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 622–628. Association for Computational Linguistics, Minneapolis (2019). https://doi.org/10.18653/v1/N19-1063
Ortiz Suárez, P.J., Romary, L., Sagot, B.: A monolingual approach to contextualized word embeddings for mid-resource languages. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 1703–1714. Association for Computational Linguistics (2020). https://www.aclweb.org/anthology/2020.acl-main.156
Papakipos, Z., Bitton, J.: AugLy: data augmentations for robustness. arXiv preprint arXiv:2201.06494 (2022)
Pfeiffer, J., Kamath, A., Rücklé, A., Cho, K., Gurevych, I.: AdapterFusion: non-destructive task composition for transfer learning. In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pp. 487–503. Association for Computational Linguistics (2021). https://doi.org/10.18653/v1/2021.eacl-main.39, https://aclanthology.org/2021.eacl-main.39
Pfeiffer, J., et al.: AdapterHub: a framework for adapting transformers. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 46–54. Association for Computational Linguistics (2020). https://doi.org/10.18653/v1/2020.emnlp-demos.7, https://aclanthology.org/2020.emnlp-demos.7
Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. In: NeurIPS EMC\(^2\) Workshop (2019)
Tan, Y.C., Celis, L.E.: Assessing social and intersectional biases in contextualized word representations. In: Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 32, pp. 13230–13241. Curran Associates, Inc. (2019)
van der Burgh, B., Verberne, S.: The merits of universal language model fine-tuning for small datasets - a case with Dutch book reviews. arXiv:1910.00896 [cs] (2019)
Vaswani, A., et al.: Attention is all you need. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems, vol. 30, pp. 5998–6008. Curran Associates, Inc. (2017)
Wang, A., Singh, A., Michael, J., Hill, F., Levy, O., Bowman, S.R.: GLUE: a multi-task benchmark and analysis platform for natural language understanding. In: Proceedings of ICLR (2019)
Webster, K., et al.: Measuring and reducing gendered correlations in pre-trained models. arXiv:2010.06032 [cs] (2020)
Wijnholds, G., Moortgat, M.: SICK-NL: a dataset for Dutch natural language inference. arXiv preprint arXiv:2101.05716 (2021)
Acknowledgements
Pieter Delobelle was supported by the Research Foundation - Flanders (FWO) under EOS No. 30992574 (VeriLearn) and received funding from the Flemish Government under the "Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen" programme. Bettina Berendt received funding from the German Federal Ministry of Education and Research (BMBF) - Nr. 16DII113. Some resources and services used in this work were provided by the VSC (Flemish Supercomputer Center), funded by the Research Foundation - Flanders (FWO) and the Flemish Government.
A Hyperparameters
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Delobelle, P., Berendt, B. (2023). FairDistillation: Mitigating Stereotyping in Language Models. In: Amini, MR., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science, vol. 13714. Springer, Cham. https://doi.org/10.1007/978-3-031-26390-3_37
DOI: https://doi.org/10.1007/978-3-031-26390-3_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26389-7
Online ISBN: 978-3-031-26390-3
eBook Packages: Computer Science; Computer Science (R0)