Voice Privacy in Biometrics

Gupta, Priyanka; Singh, Shrishti; Prajapati, Gauri P.; Patil, Hemant A.

doi:10.1007/978-3-031-15816-2_1

Part of the book series: EAI/Springer Innovations in Communication and Computing (EAISICC)

Abstract

Attempts to spoof automatic speaker verification (ASV) systems, i.e., voice biometric systems, have succeeded in the past, aided by advances in technology. Although various countermeasures have been developed for individual spoofing attacks, there is still an urgent need for a versatile countermeasure; hence, designing a voice privacy system has become crucial. Moreover, the energy losses in a speech production model contain speaker-specific information and thus provide acoustic cues for voice privacy. In this chapter, the design of a 2nd-order resonator and the linear prediction (LP) modeling of speech production are exploited to design a voice privacy system. The performance of the proposed system is compared with the secondary baseline system of the INTERSPEECH 2020 Voice Privacy Challenge. Improved performance in terms of equal error rate (EER) and word error rate (WER) is achieved for various subsets of the corpora. Furthermore, anonymization by cryptography, which has limitations in complexity and implementation cost, is discussed in detail for privacy preservation.
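
As a rough illustration of the 2nd-order resonator mentioned in the abstract, the following Python sketch maps a formant's centre frequency and bandwidth to the pole pair of a digital resonator and back, and then perturbs that pole. The bandwidth sets the pole radius, which is one standard way to represent losses in such a model. The function names, the sampling rate, and the values radius_scale=0.975 and angle_power=0.8 are assumptions chosen for illustration only; the chapter's actual design and parameter values are not given in this preview.

```python
import math


def resonator_coeffs(freq_hz, bw_hz, fs_hz):
    """Denominator [1, a1, a2] of a 2nd-order digital resonator.

    The pole pair sits at r * exp(+/- j * theta), so the denominator is
    1 - 2 r cos(theta) z^-1 + r^2 z^-2.
    """
    r = math.exp(-math.pi * bw_hz / fs_hz)    # pole radius from bandwidth
    theta = 2.0 * math.pi * freq_hz / fs_hz   # pole angle from centre frequency
    return [1.0, -2.0 * r * math.cos(theta), r * r]


def resonator_params(coeffs, fs_hz):
    """Invert resonator_coeffs: recover (freq_hz, bw_hz) from [1, a1, a2]."""
    _, a1, a2 = coeffs
    r = math.sqrt(a2)
    # Clamp to guard against tiny floating-point overshoot outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, -a1 / (2.0 * r)))
    theta = math.acos(cos_theta)
    freq_hz = theta * fs_hz / (2.0 * math.pi)
    bw_hz = -math.log(r) * fs_hz / math.pi
    return freq_hz, bw_hz


def perturb_pole(coeffs, fs_hz, radius_scale=0.975, angle_power=0.8):
    """Hypothetical perturbation: shrink the radius and warp the angle.

    radius_scale < 1 widens the bandwidth (more loss); angle_power warps the
    pole angle as theta ** angle_power. Both values are illustrative
    assumptions, not the chapter's settings.
    """
    _, a1, a2 = coeffs
    r = math.sqrt(a2)
    cos_theta = max(-1.0, min(1.0, -a1 / (2.0 * r)))
    theta = math.acos(cos_theta)
    new_r = r * radius_scale
    new_theta = theta ** angle_power
    return [1.0, -2.0 * new_r * math.cos(new_theta), new_r * new_r]


if __name__ == "__main__":
    fs = 16000.0                                  # assumed sampling rate
    original = resonator_coeffs(500.0, 100.0, fs)
    shifted = perturb_pole(original, fs)
    print("original (Hz):", resonator_params(original, fs))
    print("shifted  (Hz):", resonator_params(shifted, fs))
```

In an LP-based system, each complex-conjugate pole pair of the all-pole filter can be treated as one such resonator, so a perturbation of this kind would be applied pole by pole before resynthesis; whether the chapter does exactly this cannot be confirmed from the abstract alone.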

Author information

Authors and Affiliations

Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT), Gandhinagar, India
Priyanka Gupta, Shrishti Singh, Gauri P. Prajapati & Hemant A. Patil

Corresponding author

Correspondence to Gauri P. Prajapati.

Editor information

Editors and Affiliations

Electronics & Communication Engineering, Sarvajanik College of Engineering and Technology, Surat, India
Chirag Paunwala
Electronics & Communication Engineering, C. K. Pithawala College of Engineering and Technology, Surat, India
Mita Paunwala
Electronics & Communication Engineering, G. H. Patel College of Engineering & Technology, Vallabh Vidyanagar, Gujarat, India
Rahul Kher
Electronics & Communication Engineering, G. H. Patel College of Engineering & Technology, Vallabh Vidyanagar, Gujarat, India
Falgun Thakkar
A. D. Patel Institute of Technology, New Vallabh Vidyanagar, India
Heena Kher
School of Computer Science, University of Oklahoma, Norman, OK, USA
Mohammed Atiquzzaman
UTM Razak School, Menara Razak, Universiti Teknologi Malaysia, Kuala Lumpur, Malaysia
Norliza Mohd. Noor

About this chapter

Cite this chapter

Gupta, P., Singh, S., Prajapati, G.P., Patil, H.A. (2023). Voice Privacy in Biometrics. In: Paunwala, C., et al. Biomedical Signal and Image Processing with Artificial Intelligence. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-031-15816-2_1

DOI: https://doi.org/10.1007/978-3-031-15816-2_1
Published: 13 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15815-5
Online ISBN: 978-3-031-15816-2
eBook Packages: Engineering, Engineering (R0)
