Accent Issues in Continuous Speech Recognition System

Bhukya, Sreedhar

doi:10.1007/978-981-13-1580-0_39

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 815)

Abstract

Speech is the output of a time-varying system excited by a time-varying excitation. For voiced speech, the excitation is a train of pulses at the fundamental frequency F0. This time-varying impulse train serves as one of the features, characterized by F0 and the formant frequencies. These features vary from speaker to speaker and between genders. In this paper, the accent issues in a continuous speech recognition system are considered. Variations in F0 and the formant frequencies are the main features that characterize variation in a speaker. The variation is very small within a speaker, moderate within the same accent, and very large across different accents. This variation can be exploited to recognize gender and to improve the performance of a speech recognition system by training separate models based on gender information. Five sentences are selected for training. Each sentence is spoken and recorded by five female speakers and five male speakers. The speech corpus is preprocessed to identify the voiced and unvoiced regions. The voiced region is the only region that carries information about F0. From each voiced segment, F0 is computed. Each F0 value becomes part of the feature space and is labeled with the speaker's gender, i.e., male or female. This information is used to parameterize the models for male and female speakers. The K-means algorithm is used during training as well as testing. Testing is conducted in two ways: speaker-dependent testing and speaker-independent testing. The SPHINX-III software from Carnegie Mellon University is used to measure speech recognition accuracy on the data when the gender separation used in this research is taken into account.
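
The pipeline described in the abstract (voiced/unvoiced detection, F0 per voiced segment, K-means over the F0 values) might be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the abstract does not say how voicing is detected or how F0 is computed, so the energy and zero-crossing thresholds, the autocorrelation-based F0 estimator, and all function names here (`is_voiced`, `estimate_f0`, `kmeans_1d`) are hypothetical choices.

```python
import numpy as np

def is_voiced(frame, energy_thresh=0.01, zcr_thresh=0.25):
    """Assumed voicing rule: high short-time energy and low zero-crossing rate."""
    energy = np.mean(frame ** 2)
    zcr = np.mean(np.abs(np.diff(np.sign(frame))) > 0)
    return energy > energy_thresh and zcr < zcr_thresh

def estimate_f0(frame, sr, fmin=60.0, fmax=400.0):
    """Assumed F0 estimator: lag of the autocorrelation peak within [fmin, fmax]."""
    frame = frame - np.mean(frame)
    # Keep non-negative lags only; index 0 is lag 0.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo = int(sr / fmax)                      # shortest period searched
    hi = min(int(sr / fmin), len(ac) - 1)    # longest period searched
    lag = lo + int(np.argmax(ac[lo:hi + 1]))
    return sr / lag

def kmeans_1d(values, k=2, iters=50):
    """Plain K-means on scalar F0 values; centroids start at evenly spaced quantiles."""
    values = np.asarray(values, dtype=float)
    centroids = np.quantile(values, np.linspace(0, 1, k))
    labels = np.zeros(len(values), dtype=int)
    for _ in range(iters):
        # Assign each value to its nearest centroid.
        labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        new = np.array([values[labels == j].mean() if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels

if __name__ == "__main__":
    sr = 16000
    t = np.arange(640) / sr                  # one 40 ms frame
    # Synthetic stand-ins for voiced frames; real input would be recorded speech.
    frames = [0.5 * np.sin(2 * np.pi * f * t) for f in (110, 120, 130, 210, 220, 230)]
    f0s = [estimate_f0(fr, sr) for fr in frames if is_voiced(fr)]
    centroids, labels = kmeans_1d(f0s, k=2)
    print("centroids (Hz):", centroids)
    print("cluster labels:", labels)
```

With two clusters, the lower-F0 centroid would typically correspond to male speakers and the higher one to female speakers. That mapping is applied after clustering and is not something K-means itself provides.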

Author information

Authors and Affiliations

Speech and Vision Laboratory, IIIT-Hyderabad, Hyderabad, India
Sreedhar Bhukya

Corresponding author

Correspondence to Sreedhar Bhukya.

Editor information

Editors and Affiliations

School of Computer and Information Sciences, University of Hyderabad, Hyderabad, Telangana, India
Raju Surampudi Bapi
Department of Computer Science and Engineering, MLR Institute of Technology, Hyderabad, Telangana, India
Koppula Srinivas Rao
IDRBT, Hyderabad, Telangana, India
Munaga V. N. K. Prasad

Cite this paper

Bhukya, S. (2019). Accent Issues in Continuous Speech Recognition System. In: Bapi, R., Rao, K., Prasad, M. (eds) First International Conference on Artificial Intelligence and Cognitive Computing. Advances in Intelligent Systems and Computing, vol 815. Springer, Singapore. https://doi.org/10.1007/978-981-13-1580-0_39

DOI: https://doi.org/10.1007/978-981-13-1580-0_39
Published: 05 November 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1579-4
Online ISBN: 978-981-13-1580-0
eBook Packages: Intelligent Technologies and Robotics (R0)