Two-Layer Perceptron for Voice Recognition of Speaker’s Identity

Hu, Zhengbing; Tereikovskyi, Ihor; Korystin, Oleksandr; Mihaylenko, Victor; Tereikovska, Liudmyla

doi:10.1007/978-3-030-55506-1_46

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1247)

Included in the following conference series:

International Conference on Computer Science, Engineering and Education Applications

Abstract

The article addresses the problem of reliable authentication of users of information systems of various purposes. It shows the promise of solving this problem with voice signal analysis tools that recognize the speaker’s identity. The main advantages of such tools include the increased robustness of the biometric access code, the use of common recording equipment, and the possibility of covert monitoring of the user’s identity. The article substantiates the relevance of research into low-resource means of recognizing the speaker’s identity from voice fragments of fixed duration, using only the computing power available on site. An analysis of the literature shows the promise of neural network solutions, whose creation is complicated by uncertainty in choosing the type of neural network model and in determining the set of input parameters. The studies established that, for recognizing the speaker’s identity from voice fragments of fixed duration, it is advisable to use a two-layer perceptron whose input parameters are the mel-frequency cepstral coefficients characterizing each quasi-stationary fragment of the analyzed voice signal and whose output parameters correspond to the speakers to be recognized. Computational experiments show that each quasi-stationary fragment should be described by 20 mel-frequency cepstral coefficients. With this configuration, the speaker recognition accuracy of the two-layer perceptron is on par with the best modern tools for this purpose and is 8% higher than that of a LeNet-type convolutional neural network. The need for further research on adapting the parameters of the two-layer perceptron to recognition conditions affected by various kinds of interference was also established.
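The architecture outlined in the abstract can be illustrated with a minimal forward-pass sketch in plain Python. Only the figure of 20 cepstral coefficients per quasi-stationary fragment comes from the abstract. The number of fragments per voice sample, the hidden-layer width, the number of speakers, the tanh and softmax activations, and the concatenation of per-fragment coefficients into one input vector are all assumptions made for illustration, and the weights are random rather than trained, so the predicted speaker carries no meaning.

```python
import math
import random

N_MFCC = 20      # coefficients per quasi-stationary fragment (from the abstract)
N_FRAMES = 10    # assumed number of fragments in a fixed-duration voice sample
N_HIDDEN = 16    # assumed hidden-layer width (not given in the abstract)
N_SPEAKERS = 3   # assumed number of enrolled speakers


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small uniform random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, weights, biases):
    """Affine map: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def classify(frames, hidden, output):
    """Flatten per-fragment coefficients, apply hidden and output layers."""
    x = [c for frame in frames for c in frame]
    h = [math.tanh(v) for v in dense(x, *hidden)]
    probs = softmax(dense(h, *output))
    return probs.index(max(probs)), probs


rng = random.Random(0)
hidden = init_layer(N_MFCC * N_FRAMES, N_HIDDEN, rng)
output = init_layer(N_HIDDEN, N_SPEAKERS, rng)

# Stand-in features: random values in place of real cepstral coefficients.
frames = [[rng.gauss(0.0, 1.0) for _ in range(N_MFCC)] for _ in range(N_FRAMES)]
speaker, probs = classify(frames, hidden, output)
print(speaker, [round(p, 3) for p in probs])
```

In practice the stand-in features would be replaced by coefficients computed from real audio, and the weights would be fitted on labeled recordings of each speaker. Because the voice fragments have a fixed duration, the flattened input keeps a constant size, which is what allows a plain perceptron to be used here.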

Author information

Authors and Affiliations

School of Educational Information Technology, Central China Normal University, Wuhan, China
Zhengbing Hu
National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kiev, Ukraine
Ihor Tereikovskyi
Scientifically Research Institute of the Ministry of Internal Affairs, Kiev, Ukraine
Oleksandr Korystin
Kyiv National University of Construction and Architecture, Kiev, Ukraine
Victor Mihaylenko & Liudmyla Tereikovska

Corresponding author

Correspondence to Ihor Tereikovskyi.

Editor information

Editors and Affiliations

School of Educational Information Technology, Central China Normal University, Wuhan, Hubei, China
Zhengbing Hu
Mechanical Engineering Research Institute, Russian Academy of Sciences, Moscow, Russia
Sergey Petoukhov
Faculty of Applied Mathematics, National Technical University of Ukraine, Kiev, Ukraine
Ivan Dychka
Halmos College of Natural Sciences and Oceanography, Nova Southeastern University, Davie, FL, USA
Matthew He

About this paper

Cite this paper

Hu, Z., Tereikovskyi, I., Korystin, O., Mihaylenko, V., Tereikovska, L. (2021). Two-Layer Perceptron for Voice Recognition of Speaker’s Identity. In: Hu, Z., Petoukhov, S., Dychka, I., He, M. (eds) Advances in Computer Science for Engineering and Education III. ICCSEEA 2020. Advances in Intelligent Systems and Computing, vol 1247. Springer, Cham. https://doi.org/10.1007/978-3-030-55506-1_46

DOI: https://doi.org/10.1007/978-3-030-55506-1_46
Published: 06 August 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-55505-4
Online ISBN: 978-3-030-55506-1
eBook Packages: Intelligent Technologies and Robotics (R0)
