Automatic Preprocessing Technique for Detection of Corrupted Speech Signal Fragments for the Purpose of Speaker Recognition
In this paper we propose a preprocessing technique that detects clicks, tones, overloads, clipping, and similar distortions, and also identifies the segments of good-quality speech signal. As a result, the performance of the speaker recognition system increases significantly. In describing the noise detectors, we aim only to provide a full list of the algorithms we used, together with the parameter values we obtained in our experiments. The main goal of the paper is to demonstrate that a set of simple detectors is very effective at detecting speech for the speaker recognition task under real noise conditions.
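To make the notion of a "simple detector" concrete, the following is a minimal sketch of a frame-level clipping detector. It is an illustration only, not the algorithm or the parameters used in the paper: the function name `clipped_frames`, the frame length, the 16-bit full-scale value, and both thresholds are assumptions chosen for the example.

```python
def clipped_frames(samples, frame_len=160, full_scale=32767,
                   level=0.99, min_ratio=0.05):
    """Return indices of frames in which many samples sit near full scale.

    All parameter values are illustrative assumptions, not values
    taken from the paper.
    """
    threshold = level * full_scale
    flagged = []
    # Walk over non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Count samples whose magnitude reaches the near-peak threshold.
        near_peak = sum(1 for s in frame if abs(s) >= threshold)
        if near_peak / frame_len >= min_ratio:
            flagged.append(start // frame_len)
    return flagged
```

Frames flagged in this way could then be excluded before feature extraction, so that only the remaining frames are passed on to the speaker recognition system.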
Keywords: Preprocessing; Speaker recognition; Speech processing
This work was financially supported by the Ministry of Education and Science of the Russian Federation, contract 14.575.21.0033 (RFMEFI57514X0033), and by the Government of the Russian Federation, Grant 074-U01.