Study on the Effect of Face Masks on Forensic Speaker Recognition

Part of the Lecture Notes in Computer Science book series (LNCS, volume 13407)


The COVID-19 pandemic has led to a dramatic increase in the use of face masks. Face masks can alter both the acoustic properties of the speech signal and the speaker's speech patterns, with undesirable effects on automatic speech recognition systems as well as on forensic speaker recognition and identification systems. This is because masks introduce both intrinsic and extrinsic variability into the audio signal. Moreover, their filtering effect varies with the type of mask. In this paper we explore the impact of different masks on the performance of an automatic speaker recognition system that uses Mel Frequency Cepstral Coefficients (MFCCs) to characterise the voices and Support Vector Machines (SVMs) to perform the classification. The results show that masks slightly affect the classification results. The effects vary with the type of mask, but not as expected: the results with FFP2 masks are better than those with surgical masks. An increase in speech intensity was found with the FFP2 mask, which is related to the increased vocal effort made to counteract the effects of hearing loss.
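As an illustration of the feature-extraction stage named in the abstract, the following minimal sketch computes MFCCs for a single frame in plain Python. It is not the authors' implementation (the paper cites librosa and scikit-learn): the sampling rate, frame length, number of filters, and the helper names `hz_to_mel`, `mel_to_hz`, `power_spectrum`, `mel_filterbank`, `mfcc`, and `tone` are all assumptions chosen for illustration, and the SVM classification stage is omitted.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale (common 2595*log10 formula)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Apply a Hamming window and return the one-sided power spectrum (naive DFT)."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Build triangular filters whose edges are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges_hz]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):      # rising slope
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # peak and falling slope
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    """Return MFCCs for one frame: log mel-band energies, then an unnormalised DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Floor the energies to avoid log(0) for empty bands.
    log_energies = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
                    for row in bank]
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_energies))
            for c in range(n_coeffs)]


def tone(freq_hz, sample_rate=8000, n=256):
    """Hypothetical test signal: one frame of a pure sine wave."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


# Two synthetic "voices" with different spectral content give different MFCC vectors.
low = mfcc(tone(300.0), 8000)
high = mfcc(tone(2000.0), 8000)
print(len(low), len(high))
```

In a full system, vectors like `low` and `high` would be computed for every frame of each recording and passed to a classifier; a mask that filters the signal shifts these coefficients, which is one way it can affect recognition.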


  • Automatic speaker recognition
  • Acoustic features
  • Face mask
  • Forensic acoustics


  1. In Spanish, only the first two formants, F1 and F2, have the characteristics that make the difference between one vowel sound and another. This is due to the relationship between the location of the formants in the spectrogram and the position of the organs involved in articulation [27].

  2. According to Delgado-Romero [8], “a control sample is one that belongs to a known subject, while a recovered sample is anonymous, i.e. the identity of the person who carried it out is not known”.

  3. The corpus repository and the ASR system are available at:


  1. Atcherson, S.R., et al.: The effect of conventional and transparent surgical masks on speech understanding in individuals with and without hearing loss. J. Am. Acad. Audiol. 28, 58–67 (2017)

  2. Audacity Team: Audacity (R): Free audio editor and recorder [computer application] (2022).

  3. Boersma, P., Weenink, D.: Praat: doing phonetics by computer [computer program] (version 6.2.10) (2009). Accessed 17 Mar 2022

  4. Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, 1–27 (2011).

  5. Coniam, D.: The impact of wearing a face mask in a high-stakes oral examination: an exploratory post-SARS study in Hong Kong. Lang. Assess. Q.: Int. J. 2, 235–261 (2005)

  6. Corey, R.M., Jones, U., Singer, A.C.: Acoustic effects of medical, cloth, and transparent face masks on speech signals. J. Acoust. Soc. Am. 148, 2371–2375 (2020).

  7. Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995).

  8. Delgado-Romero, C.: La Identificación de Locutores en el Ámbito Forense (in Spanish). Ph.D. thesis, Departamento de Comunicación y Publicidad II. Facultad de Ciencias de la Información. Universidad Complutense de Madrid. España (2001)

  9. Deller, J.R., Proakis, J.G., Hansen, J.H.L.: Discrete-Time Processing of Speech Signals. Institute of Electrical and Electronics Engineers, New York (2015)

  10. ENFSI: Forensic speech and audio analysis working group terms of reference for forensic speaker analysis. European Network of Forensic Science Institutes, pp. 1–4 (2008)

  11. Leu, F.Y., Lin, G.L.: An MFCC-based speaker identification system. In: IEEE 31st International Conference on Advanced Information Networking and Applications, AINA, pp. 1055–1062. Institute of Electrical and Electronics Engineers Inc. (2017).

  12. Maher, R.C.: Principles of Forensic Audio Analysis. Springer, Heidelberg (2018).

  13. McFee, B., et al.: librosa/librosa: 0.9.1 (2022).

  14. McFee, B., et al.: Librosa: audio and music signal analysis in python. In: Proceedings of the 14th Python in Science Conference, pp. 18–24 (2015).

  15. Mendel, L.L., Gardino, J.A., Atcherson, S.R.: Speech understanding using surgical masks: a problem in health care? J. Am. Acad. Audiol. 19, 686–695 (2008)

  16. Nguyen, D.D., et al.: Acoustic voice characteristics with and without wearing a facemask. Sci. Rep. 11, 1–11 (2021).

  17. Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)

  18. Pörschmann, C., Lübeck, T., Arend, J.M.: Impact of face masks on voice radiation. J. Acoust. Soc. Am. 148, 3663–3670 (2020).

  19. Radonovich, L.J., Jr., Yanke, R., Cheng, J., Bender, B.: Diminished speech intelligibility associated with certain types of respirators worn by healthcare workers. J. Occup. Environ. Hyg. 7, 63–70 (2009)

  20. Randazzo, M., Koenig, L.L., Priefer, R.: The effect of face masks on the intelligibility of unpredictable sentences. In: Proceedings of Meetings on Acoustics, vol. 42 (2020).

  21. Rao, K.S., Vuppala, A.K.: Speech Processing in Mobile Environments. SECE, Springer, Heidelberg (2014).

  22. Ratha, N.K., Connell, J.H., Bolle, R.M.: Enhancing security and privacy in biometrics-based authentication systems. IBM Syst. J. 40(3), 614–634 (2001)

  23. Ribeiro, V., Dassie-Leite, A.P., Pereira, E.C., Santos, A.D.N., Martins, P., de Irineu, R.: Effect of wearing a face mask on vocal self-perception during a pandemic. J. Voice (2020)

  24. Saeidi, R., Huhtakallio, I., Alku, P.: Analysis of face mask effect on speaker recognition. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 08, pp. 1800–1804 (2016).

  25. Saeidi, R., Niemi, T., Karppelin, H., Pohjalainen, J., Kinnunen, T., Alku, P.: Speaker recognition for speech under face cover. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2015-January, pp. 1012–1016 (2015).

  26. Saleem, S., Subhan, F., Naseer, N., Bais, A., Imtiaz, A.: Forensic speaker recognition: a new method based on extracting accent and language information from short utterances. Forensic Sci. Int.: Digital Invest. 34, 300982 (2020)

  27. Sánchez-López, D.: Análisis acústico y sonográfico de la vocal /a/ para su aplicación en el ámbito de las ciencias forenses (2016). (in Spanish)

  28. Wainer, J., Fonseca, P.: How to tune the RBF SVM hyperparameters? An empirical evaluation of 18 search algorithms. Artif. Intell. Rev. 54, 4771–4797 (2021)

  29. Wu, Z., Evans, N., Kinnunen, T., Yamagishi, J., Alegre, F., Li, H.: Spoofing and countermeasures for speaker verification: a survey. Speech Commun. 66, 130–153 (2015)


This research was supported by the Research Grants Program of the Universidad de Alcalá. We acknowledge the valuable counsel and resources provided by G. A. Acha Ruiz, and we thank the Department of Forensic Acoustics of the “Comisaría General de Policía Científica” for access to the LOCUPOL database sentences.

Corresponding author

Correspondence to Hilario Gómez-Moreno.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Bogdanel, G., Belghazi-Mohamed, N., Gómez-Moreno, H., Lafuente-Arroyo, S. (2022). Study on the Effect of Face Masks on Forensic Speaker Recognition. In: Alcaraz, C., Chen, L., Li, S., Samarati, P. (eds) Information and Communications Security. ICICS 2022. Lecture Notes in Computer Science, vol 13407. Springer, Cham.

  • DOI:

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15776-9

  • Online ISBN: 978-3-031-15777-6

  • eBook Packages: Computer Science, Computer Science (R0)