Approaches to Solving Problems of Markov Modeling Training in Speech Recognition

  • Conference paper
Emerging Trends and Applications in Artificial Intelligence (ICETAI 2023)

Abstract

This paper discusses approaches to training Markov models for speech recognition. A Markov process is a stochastic process consisting of a sequence of random states in which the probability of transitioning from one state to another depends only on the current state, not on earlier states. Observing such a process yields the sequence of states that the system passes through during the observation period. Model training is considered the most difficult task in applying Markov models to recognition systems: no single, universal method for solving it is known, and recognition quality depends on the training result. Training therefore requires special attention.
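The Markov property and the simplest form of training can be illustrated with a minimal sketch. It is not the method of the paper: it assumes a fully observed chain with two hypothetical states ("silence" and "speech") and made-up transition probabilities, and it estimates the transition matrix by counting observed transitions. In speech recognition the states are usually hidden, so counting cannot be applied directly and iterative procedures such as Baum-Welch re-estimation are typically used instead.

```python
import random


def sample_chain(transitions, start, length, rng):
    """Sample a state sequence; each step depends only on the current state."""
    state = start
    path = [state]
    for _ in range(length - 1):
        next_states = list(transitions[state])
        weights = [transitions[state][s] for s in next_states]
        state = rng.choices(next_states, weights=weights, k=1)[0]
        path.append(state)
    return path


def estimate_transitions(path, states):
    """Estimate transition probabilities as normalized transition counts."""
    counts = {a: {b: 0 for b in states} for a in states}
    for a, b in zip(path, path[1:]):
        counts[a][b] += 1
    estimate = {}
    for a in states:
        total = sum(counts[a].values())
        estimate[a] = {
            b: (counts[a][b] / total if total else 0.0) for b in states
        }
    return estimate


# Hypothetical two-state chain; the probabilities are illustrative only.
TRUE_TRANSITIONS = {
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.2, "speech": 0.8},
}

rng = random.Random(0)  # fixed seed so the run is repeatable
path = sample_chain(TRUE_TRANSITIONS, "silence", 5000, rng)
estimated = estimate_transitions(path, list(TRUE_TRANSITIONS))

for state, row in estimated.items():
    print(state, {nxt: round(p, 3) for nxt, p in row.items()})
```

With a long enough observed sequence the estimated rows should approach the probabilities used to generate it. This shows why training quality matters: the recognizer only ever sees the estimated parameters, not the true ones.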

Author information

Corresponding author

Correspondence to D. T. Muxamediyeva.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Muxamediyeva, D.T., Niyozmatova, N.A., Sobirov, R.A., Samijonov, B.N., Khamidov, E.K. (2024). Approaches to Solving Problems of Markov Modeling Training in Speech Recognition. In: García Márquez, F.P., Jamil, A., Hameed, A.A., Segovia Ramírez, I. (eds) Emerging Trends and Applications in Artificial Intelligence. ICETAI 2023. Lecture Notes in Networks and Systems, vol 960. Springer, Cham. https://doi.org/10.1007/978-3-031-56728-5_29

  • DOI: https://doi.org/10.1007/978-3-031-56728-5_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-56727-8

  • Online ISBN: 978-3-031-56728-5

  • eBook Packages: Engineering, Engineering (R0)
