Robustness of Speech Recognition System of Isolated Speech in Macedonian

  • Conference paper
ICT Innovations 2014

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 311)

Abstract

For more than five decades, scientists have attempted to design machines that accurately transcribe spoken words. Even though satisfactory accuracy has been achieved, machines cannot yet recognize every voice, in any environment, from any speaker. In this paper we tackle the problem of robustness of Automatic Speech Recognition for isolated Macedonian speech in noisy environments. The goal is to overcome the problem of changing background noise types. Five different types of noise were artificially added to the audio recordings, and models were trained and evaluated for each one. The worst-case scenario for the speech recognition system turned out to be babble noise, which at the higher noise levels reaches an error rate of 81.10%. It is shown that the error rate increases as the noise increases, and that the model trained with clean speech gives considerably better results at lower noise levels.
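The abstract does not say how the noise was scaled or how the error rate was computed, so the following is only a minimal sketch under stated assumptions: noise levels are expressed as a signal-to-noise ratio (SNR) in dB, and the error rate is the share of isolated-word utterances recognized incorrectly. The function names `mix_at_snr` and `isolated_word_error_rate` and the synthetic signals are hypothetical and are not taken from the paper.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mix has the requested SNR in dB.

    Assumption: noise level is defined through SNR; the paper may use
    a different definition.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    # SNR_dB = 10 * log10(P_speech / P_noise), solved for the noise gain.
    target_noise_power = power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    return [s + gain * n for s, n in zip(speech, noise)]


def isolated_word_error_rate(references, hypotheses):
    """Percentage of isolated-word utterances recognized incorrectly."""
    errors = sum(r != h for r, h in zip(references, hypotheses))
    return 100.0 * errors / len(references)


# Synthetic stand-ins for a recording and a noise sample (not real data).
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
noisy = mix_at_snr(speech, noise, snr_db=10)
```

Sweeping `snr_db` over several values and repeating the mix for each noise type would give one noisy copy of the data set per condition, which is the kind of setup the abstract describes for training and evaluation.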

Author information

Correspondence to Daniel Spasovski.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Spasovski, D., Peshanski, G., Madjarov, G., Gjorgjevikj, D. (2015). Robustness of Speech Recognition System of Isolated Speech in Macedonian. In: Bogdanova, A., Gjorgjevikj, D. (eds) ICT Innovations 2014. ICT Innovations 2014. Advances in Intelligent Systems and Computing, vol 311. Springer, Cham. https://doi.org/10.1007/978-3-319-09879-1_20

  • DOI: https://doi.org/10.1007/978-3-319-09879-1_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09878-4

  • Online ISBN: 978-3-319-09879-1

  • eBook Packages: Engineering, Engineering (R0)