Information Redundancy in Constructing Systems for Audio Signal Examination on Deep Learning Neural Networks

Cybernetics and Systems Analysis

Abstract

This paper describes signal preprocessing methods for building new tools to examine digital sound recording materials and equipment. Using information redundancy when building the training set for the deep learning neural networks used in such examination is shown to increase the efficiency of speaker identification from voice characteristics by about 15%. The proposed preprocessing methods are also shown to enable speaker identification from phonograms only 1 second long.

Author information

Corresponding author

Correspondence to V. I. Solovyov.

Additional information

Translated from Kibernetyka ta Systemnyi Analiz, No. 1, January–February, 2022, pp. 11–20.

About this article

Cite this article

Solovyov, V.I., Rybalskiy, O.V., Zhuravel, V.V. et al. Information Redundancy in Constructing Systems for Audio Signal Examination on Deep Learning Neural Networks. Cybern Syst Anal 58, 8–15 (2022). https://doi.org/10.1007/s10559-022-00429-2

  • DOI: https://doi.org/10.1007/s10559-022-00429-2
