Abstract
This paper describes signal preprocessing methods used to build new tools for examining digital sound recordings and the equipment that produces them. Exploiting information redundancy when building the training set for the deep learning neural networks used in such examination is shown to increase the efficiency of speaker identification from voice characteristic parameters by about 15%. The proposed processing methods are also shown to enable speaker identification from phonograms as short as 1 second.
Additional information
Translated from Kibernetyka ta Systemnyi Analiz, No. 1, January–February, 2022, pp. 11–20.
About this article
Cite this article
Solovyov, V.I., Rybalskiy, O.V., Zhuravel, V.V. et al. Information Redundancy in Constructing Systems for Audio Signal Examination on Deep Learning Neural Networks. Cybern Syst Anal 58, 8–15 (2022). https://doi.org/10.1007/s10559-022-00429-2
DOI: https://doi.org/10.1007/s10559-022-00429-2