Robustness of Speech Recognition System of Isolated Speech in Macedonian

Spasovski, Daniel; Peshanski, Goran; Madjarov, Gjorgji; Gjorgjevikj, Dejan

doi:10.1007/978-3-319-09879-1_20

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 311)

Included in the following conference series:

International Conference on ICT Innovations

Abstract

For more than five decades, scientists have attempted to design machines that accurately transcribe spoken words. Even though satisfactory accuracy has been achieved, machines cannot recognize every voice, in every environment, from every speaker. In this paper we tackle the problem of robustness of Automatic Speech Recognition for isolated Macedonian speech in noisy environments. The goal is to overcome the problem of changes in the type of background noise. Five different types of noise were artificially added to the audio recordings, and the models were trained and evaluated for each one. The worst-case scenario for the speech recognition systems turned out to be babble noise, for which the error rate reaches 81.10% at the higher noise levels. It is shown that the error rate increases as the noise increases, and that the model trained with clean speech gives considerably better results at lower noise levels.
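The abstract mentions that noise was artificially added to the recordings at several levels. A minimal sketch of one common way to do this is shown below, assuming the noise level is expressed as a signal-to-noise ratio (SNR) in dB; the abstract does not state how the levels were defined, and the function names `mix_at_snr` and `measured_snr_db` are hypothetical, not taken from the paper.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus scaled noise so the mixture has the requested SNR (dB).

    Assumption: both inputs are lists of float samples at the same sample rate.
    """
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    # Loop the noise so it covers the whole utterance, then trim it.
    repeats = -(-len(speech) // len(noise))  # ceiling division
    noise = (list(noise) * repeats)[:len(speech)]
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_noise == 0:
        raise ValueError("noise is silent")
    # The gain g solves p_speech / (g^2 * p_noise) = 10^(snr_db / 10).
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(clean, noisy):
    """Measure the SNR of a mixture, given the clean signal it was built from."""
    p_clean = sum(c * c for c in clean) / len(clean)
    p_resid = sum((y - c) ** 2 for c, y in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(p_clean / p_resid)


if __name__ == "__main__":
    random.seed(0)
    # Synthetic stand-ins for a recorded word and a noise recording.
    speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
    noise = [random.gauss(0.0, 1.0) for _ in range(500)]
    for level in (20, 10, 0):
        noisy = mix_at_snr(speech, noise, level)
        print(level, round(measured_snr_db(speech, noisy), 3))
```

Scaling the noise rather than the speech keeps the utterance itself unchanged across levels, so any difference in recognition error can be attributed to the noise alone.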

Author information

Authors and Affiliations

Netcetera, Skopje, Macedonia
Daniel Spasovski & Goran Peshanski
Faculty of Computer Science and Engineering, Skopje, Macedonia
Gjorgji Madjarov & Dejan Gjorgjevikj

Corresponding author

Correspondence to Daniel Spasovski.

Editor information

Editors and Affiliations

Faculty of Computer Science and Engineering, Ss Cyril and Methodius University, Skopje, Macedonia
Ana Madevska Bogdanova
Faculty of Computer Science and Engineering, Ss Cyril and Methodius University, Skopje, Macedonia
Dejan Gjorgjevikj

About this paper

Cite this paper

Spasovski, D., Peshanski, G., Madjarov, G., Gjorgjevikj, D. (2015). Robustness of Speech Recognition System of Isolated Speech in Macedonian. In: Bogdanova, A., Gjorgjevikj, D. (eds) ICT Innovations 2014. Advances in Intelligent Systems and Computing, vol 311. Springer, Cham. https://doi.org/10.1007/978-3-319-09879-1_20

DOI: https://doi.org/10.1007/978-3-319-09879-1_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09878-4
Online ISBN: 978-3-319-09879-1
eBook Packages: Engineering, Engineering (R0)
