Approaches to Solving Problems of Markov Modeling Training in Speech Recognition

Muxamediyeva, D. T.; Niyozmatova, N. A.; Sobirov, R. A.; Samijonov, B. N.; Khamidov, E. Kh.

doi:10.1007/978-3-031-56728-5_29

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 960)

Included in the following conference series:

International Conference on Emerging Trends and Applications in Artificial Intelligence

Abstract

This article discusses approaches to the problem of training Markov models for speech recognition. A Markov process is a stochastic process made up of a sequence of random states, in which the probability of moving from one state to the next depends only on the current state and not on the states that preceded it. Observing such a process yields the sequence of states that the system passes through during the observation period. Training is considered the most difficult task in applying Markov models to recognition systems: no single, universal method for solving it is known, and recognition quality depends on the result of training. Model training therefore deserves special attention.
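The Markov property described in the abstract can be illustrated with a small sketch. It is not the authors' method: the two-state model, the state names, and the functions `sample_chain` and `estimate_transitions` are hypothetical, chosen only for illustration. The sketch samples a state sequence from a fully observed Markov chain and then recovers the transition probabilities by counting, which is the simplest possible form of training. Models used in speech recognition typically have hidden states, which is what makes training hard, and the sketch does not cover that case.

```python
import random


def sample_chain(transitions, start, steps, rng):
    """Sample a state sequence of length steps + 1.

    The next state is drawn using only the current state,
    which is the Markov property.
    """
    states = [start]
    for _ in range(steps):
        row = transitions[states[-1]]
        next_states = list(row.keys())
        weights = list(row.values())
        states.append(rng.choices(next_states, weights=weights, k=1)[0])
    return states


def estimate_transitions(sequence):
    """Estimate transition probabilities as normalized transition counts.

    Assumes the states themselves are observed (not hidden).
    """
    counts = {}
    for current, following in zip(sequence, sequence[1:]):
        row = counts.setdefault(current, {})
        row[following] = row.get(following, 0) + 1
    return {
        state: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for state, row in counts.items()
    }


# Hypothetical toy model: probabilities are made up for illustration.
TOY_TRANSITIONS = {
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.2, "speech": 0.8},
}

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    observed = sample_chain(TOY_TRANSITIONS, "silence", 5000, rng)
    print(estimate_transitions(observed))
```

With a long enough observed sequence, the estimated probabilities should approach those in `TOY_TRANSITIONS`. When the states are hidden, counting is no longer possible and an iterative or search-based procedure is needed instead.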

Author information

Authors and Affiliations

Digital Technologies and Artificial Intelligence, “Tashkent Institute of Irrigation and Agricultural Mechanization Engineers” National Research University, Tashkent, Uzbekistan
D. T. Muxamediyeva, N. A. Niyozmatova, R. A. Sobirov & E. Kh. Khamidov
Sejong University, Seoul, South Korea
B. N. Samijonov

Corresponding author

Correspondence to D. T. Muxamediyeva.

Editor information

Editors and Affiliations

Ingenium Research Group, University of Castilla-La Mancha, Ciudad Real, Spain
Fausto Pedro García Márquez
National University of Computer and Emerging Sciences, Islamabad, Pakistan
Akhtar Jamil
Department of Computer Engineering, Istinye University, Istanbul, Türkiye
Alaa Ali Hameed
Ingenium Research Group, University of Castilla-La Mancha (UCLM), Ciudad Real, Spain
Isaac Segovia Ramírez

About this paper

Cite this paper

Muxamediyeva, D.T., Niyozmatova, N.A., Sobirov, R.A., Samijonov, B.N., Khamidov, E.K. (2024). Approaches to Solving Problems of Markov Modeling Training in Speech Recognition. In: García Márquez, F.P., Jamil, A., Hameed, A.A., Segovia Ramírez, I. (eds) Emerging Trends and Applications in Artificial Intelligence. ICETAI 2023. Lecture Notes in Networks and Systems, vol 960. Springer, Cham. https://doi.org/10.1007/978-3-031-56728-5_29

DOI: https://doi.org/10.1007/978-3-031-56728-5_29
Published: 30 April 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-56727-8
Online ISBN: 978-3-031-56728-5
eBook Packages: Engineering, Engineering (R0)
