Sequential Three-Way Decisions in Efficient Classification of Piecewise Stationary SpeechSignals

  • Andrey V. Savchenko
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10314)


In this paper it is proposed to improve performance of the automatic speech recognition by using sequential three-way decisions. At first, the largest piecewise quasi-stationary segments are detected in the speech signal. Every segment is classified using the maximum a-posteriori (MAP) method implemented with the Kullback-Leibler minimum information discrimination principle. The three-way decisions are taken for each segment using the multiple comparisons and asymptotical properties of the Kullback-Leibler divergence. If the non-commitment option is chosen for any segment, it is divided into small subparts, and the decision-making is sequentially repeated by fusing the classification results for each subpart until accept or reject options are chosen or the size of each subpart becomes relatively low. Thus, each segment is associated with a hierarchy of variable-scale subparts (granules in rough set theory). In the experimental study the proposed procedure is used in speech recognition with Russian language. It was shown that our approach makes it possible to achieve high efficiency even in the presence of high level of noise in the observed utterance.


Signal processing Speech recognition Three-way decisions Sequential analysis Granular computing Kullback-Leibler divergence 



The work was prepared within the framework of the Academic Fund Program at the National Research University Higher School of Economics (HSE) in 2017 (grant No 17-05-0007) and is supported by the Russian Academic Excellence Project “5–100” and Russian Federation President grant no. MD-306.2017.9.


  1. 1.
    Tyagi, V., Bourlard, H., Wellekens, C.: On variable-scale piecewise stationary spectral analysis of speech signals for ASR. Speech Commun. 48, 1182–1191 (2006)CrossRefGoogle Scholar
  2. 2.
    Savchenko, A.V., Belova, N.S.: Statistical testing of segment homogeneity in classification of piecewise-regular objects. Int. J. Appl. Math. Comput. Sci. 25, 915–925 (2015)MathSciNetCrossRefzbMATHGoogle Scholar
  3. 3.
    Huang, K., Aviyente, S.: Sparse representation for signal classification. In: Advances of Neural Information Processing Systems (NIPS), pp. 609–616. MIT Press (2006)Google Scholar
  4. 4.
    Khan, M.R., Padhi, S.K., Sahu, B.N., Behera, S.: Non stationary signal analysis and classification using FTT transform and naive bayes classifier. In: IEEE Power, Communication and Information Technology Conference (PCITC), pp. 967–972. IEEE Press (2015)Google Scholar
  5. 5.
    Savchenko, A.V.: Search Techniques in Intelligent Classification Systems. Springer International Publishing, New York (2016)CrossRefzbMATHGoogle Scholar
  6. 6.
    Benesty, J., Sondhi, M.M., Huang, Y.: Springer Handbook of Speech Processing. Springer, Berlin (2008)CrossRefGoogle Scholar
  7. 7.
    Peebles, P.Z., Read, J., Read, P.: Probability, Random Variables, and Random Signal Principles. McGraw-Hill, New York (2001)Google Scholar
  8. 8.
    Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. The MIT Press, Cambridge (2016)Google Scholar
  9. 9.
    Yu, D., Deng, L.: Automatic Speech Recognition: A Deep Learning Approach. Springer, New York (2014)zbMATHGoogle Scholar
  10. 10.
    Sak, H., Senior, A.W., Beaufays, F.: Long short-term memory recurrent neural network architectures for large scale acoustic modeling. In: Interspeech, pp. 338–342 (2014)Google Scholar
  11. 11.
    Stan, A., Mamiya, Y., Yamagishi, J., Bell, P., Watts, O., Clark, R.A., King, S.: ALISA: an automatic lightly supervised speech segmentation and alignment tool. Comput. Speech Lang. 35, 116–133 (2016)CrossRefGoogle Scholar
  12. 12.
    Yao, Y.Y.: Granular computing and sequential three-way decisions. In: Lingras, P., Wolski, M., Cornelis, C., Mitra, S., Wasilewski, P. (eds.) RSKT 2013. LNCS (LNAI), vol. 8171, pp. 16–27. Springer, Heidelberg (2013)Google Scholar
  13. 13.
    Savchenko, A.V.: Fast multi-class recognition of piecewise regular objects based on sequential three-way decisions and granular computing. Knowl.-Based Syst. 91, 252–262 (2016)CrossRefGoogle Scholar
  14. 14.
    Li, H., Zhang, L., Huang, B., Zhou, X.: Sequential three-way decision and granulation for cost-sensitive face recognition. Knowl.-Based Syst. 91, 241–251 (2016)CrossRefGoogle Scholar
  15. 15.
    Yao, Y.: Three-way decisions with probabilistic rough sets. Inf. Sci. 180, 341–353 (2010)MathSciNetCrossRefGoogle Scholar
  16. 16.
    Yao, Y.: Interval sets and three-way concept analysis in incomplete contexts. Int. J. Mach. Learn. Cybern. 8(1), 1–18 (2017)CrossRefGoogle Scholar
  17. 17.
    Pawlak, Z.: Rough Sets: Theoretical Aspects of Reasoning About Data. Kluwer Academic Publishers, Norwell, MA, USA (1992)zbMATHGoogle Scholar
  18. 18.
    Li, H., Zhang, L., Zhou, X., Huang, B.: Cost-sensitive sequential three-way decision modeling using a deep neural network. Int. J. Approx. Reason. 85, 68–78 (2017)MathSciNetCrossRefGoogle Scholar
  19. 19.
    Li, Y., Zhang, Z.H., Chen, W.B., Min, F.: TDUP: an approach to incremental mining of frequent itemsets with three-way-decision pattern updating. Int. J. Mach. Learn. Cybern. 8(2), 441–453 (2017)CrossRefGoogle Scholar
  20. 20.
    Ren, R., Wei, L.: The attribute reductions of three-way concept lattices. Knowl.-Based Syst. 99, 92–102 (2016)CrossRefGoogle Scholar
  21. 21.
    Yao, J., Azam, N.: Web-based medical decision support systems for three-way medical decision making with game-theoretic rough sets. IEEE Trans. Fuzzy Syst. 23(1), 3–15 (2015)CrossRefGoogle Scholar
  22. 22.
    Zhang, H.R., Min, F., Shi, B.: Regression-based three-way recommendation. Inf. Sci. 378, 444–461 (2017)CrossRefGoogle Scholar
  23. 23.
    Li, W., Huang, Z., Li, Q.: Three-way decisions based software defect prediction. Knowl.-Based Syst. 91, 263–274 (2016)CrossRefGoogle Scholar
  24. 24.
    Pedrycz, W.: Granular Computing: Analysis and Design of Intelligent Systems. CRC Press, Boca Raton (2013)CrossRefGoogle Scholar
  25. 25.
    Wang, X., Pedrycz, W., Gacek, A., Liu, X.: From numeric data to information granules: a design through clustering and the principle of justifiable granularity. Knowl.-Based Syst. 101, 100–113 (2016)CrossRefGoogle Scholar
  26. 26.
    Itakura, F.: Minimum prediction residual principle applied to speech recognition. IEEE Trans. Acoust. Speech Signal Process. 23, 67–72 (1975)CrossRefGoogle Scholar
  27. 27.
    Savchenko, A.V., Savchenko, L.V.: Towards the creation of reliable voice control system based on a fuzzy approach. Pattern Recognit. Lett. 65, 145–151 (2015)CrossRefGoogle Scholar
  28. 28.
    Gray, R.M., Buzo, A., Gray, J.A., Matsuyama, Y.: Distortion Measures for Speech Processing. IEEE Trans. Acoust. Speech Signal Process. 28, 367–376 (1980)CrossRefzbMATHGoogle Scholar
  29. 29.
    Savchenko, V.V., Savchenko, A.V.: Information-theoretic analysis of efficiency of the phonetic encoding-decoding method in automatic speech recognition. J. Commun. Technol. Electron. 61, 430–435 (2016)CrossRefGoogle Scholar
  30. 30.
    Kullback, S.: Information Theory and Statistics. Dover Publications, New York (1997)zbMATHGoogle Scholar
  31. 31.
    Marple, S.L.: Digital Spectral Analysis: With Applications. Prentice Hall, Upper Saddle River (1987)Google Scholar
  32. 32.
    Benzeghiba, M., De Mori, R., Deroo, O., Dupont, S., Erbes, T., Jouvet, D., Fissore, L., Laface, P., Mertins, A., Ris, C., Rose, R., Tyagi, V., Wellekens, C.: Automatic speech recognition and speech variability: a review. Speech Commun. 49, 763–786 (2007)CrossRefGoogle Scholar
  33. 33.
    Lingras, P., Chen, M., Miao, D.: Rough multi-category decision theoretic framework. In: Wang, G., Li, T., Grzymala-Busse, J.W., Miao, D., Skowron, A., Yao, Y. (eds.) RSKT 2008. LNCS, vol. 5009, pp. 676–683. Springer, Heidelberg (2008). doi: 10.1007/978-3-540-79721-0_90 CrossRefGoogle Scholar
  34. 34.
    Zhou, B.: Multi-class decision-theoretic rough sets. Int. J. Approx. Reason. 55(1), 211–224 (2014)MathSciNetCrossRefzbMATHGoogle Scholar
  35. 35.
    Ju, H.R., Li, H.X., Yang, X.B., Zhou, X.Z.: Cost-sensitive rough set: a multi-granulation approach. Knowl.-Based Syst. 123, 137–153 (2017)CrossRefGoogle Scholar
  36. 36.
    Deng, G., Jia, X.: A decision-theoretic rough set approach to multi-class cost-sensitive classification. In: Flores, V., et al. (eds.) IJCRS 2016. LNCS, vol. 9920, pp. 250–260. Springer, Cham (2016). doi: 10.1007/978-3-319-47160-0_23 CrossRefGoogle Scholar
  37. 37.
    Liu, D., Li, T., Li, H.: A multiple-category classification approach with decision-theoretic rough sets. Fundam. Inform. 115(2–3), 173–188 (2012)MathSciNetzbMATHGoogle Scholar
  38. 38.
    Hochberg, Y., Tamhane, A.C.: Multiple Comparison Procedures. Wiley, Hoboken (2009)zbMATHGoogle Scholar
  39. 39.
    Benjamini, Y., Hochberg, Y.: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Royal Stat. Soc. Series B (Methodol.) 57(1), 289–300 (1995)MathSciNetzbMATHGoogle Scholar
  40. 40.
    Savchenko, A.V., Savchenko, L.V.: Classification of a sequence of objects with the fuzzy decoding method. In: Cornelis, C., Kryszkiewicz, M., Ślȩzak, D., Ruiz, E.M., Bello, R., Shang, L. (eds.) RSCTC 2014. LNCS, vol. 8536, pp. 309–318. Springer, Cham (2014). doi: 10.1007/978-3-319-08644-6_32 Google Scholar
  41. 41.
    Savchenko, A.V.: Semi-automated speaker adaptation: how to control the quality of adaptation? In: Elmoataz, A., Lezoray, O., Nouboud, F., Mammass, D. (eds.) ICISP 2014. LNCS, vol. 8509, pp. 638–646. Springer, Cham (2014). doi: 10.1007/978-3-319-07998-1_73 Google Scholar
  42. 42.
    Savchenko, A.V.: Phonetic words decoding software in the problem of Russian speech recognition. Autom. Remote Control 74(7), 1225–1232 (2013)CrossRefGoogle Scholar
  43. 43.
    Povey, D., Ghoshal, A., Boulianne, G., Burget, L., Glembek, O., Goel, N., Hannemann, M., Motlicek, P., Qian, Y., Schwarz, P., Silovsky, J.: The kaldi speech recognition toolkit. In: IEEE 2011 Workshop on Automatic Speech Recognition and Understanding. IEEE Signal Processing Society (2011)Google Scholar
  44. 44.
    Gillick, L., Cox, S.: Some statistical issues in the comparison of speech recognition algorithms. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 532–535 (1989)Google Scholar

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. 1.Laboratory of Algorithms and Technologies for Network AnalysisNational Research University Higher School of EconomicsNizhny NovgorodRussian Federation

Personalised recommendations