Accent Issues in Continuous Speech Recognition System

  • Conference paper
First International Conference on Artificial Intelligence and Cognitive Computing

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 815))

Abstract

Speech is the output of a time-varying system driven by a time-varying excitation. For voiced sounds, the excitation is a train of pulses with fundamental frequency F0. This time-varying impulse train serves as one of the features, characterized by its fundamental frequency F0 and the formant frequencies. These features vary from one speaker to another and also between genders. In this paper, accent issues in a continuous speech recognition system are considered. Variations in F0 and the formant frequencies are the main features that characterize variation between speakers. The variation is very small within a single speaker, moderate among speakers of the same accent, and very high across different accents. This variation can be exploited to recognize the speaker's gender and to improve the performance of a speech recognition system by training separate models based on gender. Five sentences are selected for training. Each sentence is spoken and recorded by five female speakers and five male speakers. The speech corpus is preprocessed to identify the voiced and unvoiced regions. The voiced regions are the only regions that carry information about F0. F0 is computed from each voiced segment. Each F0 value forms part of a feature space labeled with the speaker's gender, i.e., male or female. This information is used to parameterize the models for male and female speakers. The K-means algorithm is used during both training and testing. Testing is conducted in two ways: speaker-dependent testing and speaker-independent testing. The SPHINX-III software from Carnegie Mellon University is used to measure speech recognition accuracy on the data, taking into account the gender separation used in this research.
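The pipeline outlined in the abstract (voiced/unvoiced detection, F0 extraction from voiced segments, and K-means over the F0 values) can be illustrated with a minimal Python sketch. The abstract does not say how voicing or F0 is computed, so the short-time energy and zero-crossing test, the autocorrelation pitch estimator, every threshold, and all function names below are assumptions chosen for illustration, not the author's implementation.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (drops the ragged tail)."""
    n = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n)])

def is_voiced(frame, energy_thresh=0.01, zcr_thresh=0.25):
    """Assumed voicing test: high short-time energy and low zero-crossing rate."""
    energy = np.mean(frame ** 2)
    zcr = np.mean(np.abs(np.diff(np.sign(frame))) > 0)
    return bool(energy > energy_thresh and zcr < zcr_thresh)

def estimate_f0(frame, sr, fmin=60.0, fmax=400.0):
    """Assumed F0 estimator: strongest autocorrelation peak in the pitch lag range."""
    frame = frame - np.mean(frame)
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)
    lag = lo + int(np.argmax(ac[lo:hi + 1]))
    return sr / lag

def kmeans_1d(values, k=2, iters=50):
    """Plain K-means on scalar F0 values with a deterministic quantile init."""
    values = np.asarray(values, dtype=float)
    centers = np.quantile(values, np.linspace(0.0, 1.0, k))
    for _ in range(iters):
        # Assign each value to its nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        # Recompute centers; keep the old one if a cluster is empty.
        new_centers = np.array([
            values[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

def voiced_f0_track(x, sr, frame_len=640, hop=320):
    """Return F0 estimates for the frames judged voiced."""
    frames = frame_signal(x, frame_len, hop)
    return [estimate_f0(f, sr) for f in frames if is_voiced(f)]
```

With two clusters, the lower-F0 center would typically be read as the male group and the higher one as the female group; that mapping is itself an assumption, and a real system would likely combine F0 with formant features rather than cluster on F0 alone.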

Author information

Corresponding author

Correspondence to Sreedhar Bhukya.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Bhukya, S. (2019). Accent Issues in Continuous Speech Recognition System. In: Bapi, R., Rao, K., Prasad, M. (eds) First International Conference on Artificial Intelligence and Cognitive Computing. Advances in Intelligent Systems and Computing, vol 815. Springer, Singapore. https://doi.org/10.1007/978-981-13-1580-0_39
