Abstract
For more than five decades, scientists have attempted to design a machine that accurately transcribes spoken words. Even though satisfactory accuracy has been achieved, machines cannot recognize every voice, in any environment, from any speaker. In this paper we tackle the problem of the robustness of Automatic Speech Recognition for isolated Macedonian speech in noisy environments. The goal is to overcome the problem of changing background noise types. Five different types of noise were artificially added to the audio recordings, and the models were trained and evaluated for each one. The worst-case scenario for the speech recognition systems turned out to be babble noise, which reaches an 81.10% error rate at the higher noise levels. The results show that the error rate increases as the noise increases, and that the model trained with clean speech gives considerably better results at lower noise levels.
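The noise-corruption and scoring steps mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how noise levels were defined or how the error rate was computed, so everything here is an assumption: the sketch treats a noise level as a signal-to-noise ratio in dB, scores isolated words by exact match, and uses hypothetical helper names (`power`, `mix_at_snr`, `error_rate`) that do not come from the paper.

```python
import math


def power(signal):
    """Mean power (average squared amplitude) of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mix has the requested SNR in dB.

    Assumption: the noise level is expressed as an SNR; the paper may have
    used a different definition.
    """
    noise = list(noise)
    if len(noise) < len(speech):
        # Loop the noise recording so it covers the whole utterance.
        noise = noise * math.ceil(len(speech) / len(noise))
    noise = noise[:len(speech)]

    p_noise = power(noise)
    if p_noise == 0:
        raise ValueError("noise signal is silent")

    # SNR_dB = 10 * log10(P_speech / (gain^2 * P_noise)), solved for gain.
    gain = math.sqrt(power(speech) / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def error_rate(references, hypotheses):
    """Percentage of isolated words that were recognized incorrectly."""
    errors = sum(r != h for r, h in zip(references, hypotheses))
    return 100.0 * errors / len(references)
```

Under these assumptions, each utterance would be passed through `mix_at_snr` once per noise type and noise level, and `error_rate` would then be computed over the recognizer's outputs for that condition.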
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Spasovski, D., Peshanski, G., Madjarov, G., Gjorgjevikj, D. (2015). Robustness of Speech Recognition System of Isolated Speech in Macedonian. In: Bogdanova, A., Gjorgjevikj, D. (eds) ICT Innovations 2014. ICT Innovations 2014. Advances in Intelligent Systems and Computing, vol 311. Springer, Cham. https://doi.org/10.1007/978-3-319-09879-1_20
DOI: https://doi.org/10.1007/978-3-319-09879-1_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09878-4
Online ISBN: 978-3-319-09879-1
eBook Packages: Engineering, Engineering (R0)