Abstract
Speech is the output of a time-varying system excited by a time-varying excitation. The excitation generates pulses with fundamental frequency F0. This time-varying impulse train serves as one of the features and is characterized by its fundamental frequency F0 and its formant frequencies. These features vary from speaker to speaker and between genders. In this paper, accent issues in continuous speech recognition systems are considered. Variations in F0 and formant frequencies are the main features that characterize variation between speakers. The variation is very small within a speaker, moderate within the same accent, and very high across different accents. This variation can be exploited to recognize gender and to improve the performance of a speech recognition system by building separate models based on gender information. Five sentences are selected for training. Each sentence is spoken and recorded by five female speakers and five male speakers. The speech corpus is preprocessed to identify the voiced and unvoiced regions. Only the voiced regions carry information about F0. F0 is computed from each voiced segment. Each F0 value forms part of the feature space and is labeled with the speaker's gender, i.e., male or female. This information is used to parameterize the models for male and female speakers. The K-means algorithm is used during both training and testing. Testing is conducted in two ways: speaker-dependent testing and speaker-independent testing. The SPHINX-III software from Carnegie Mellon University is used to measure speech recognition accuracy when the gender separation used in this research is taken into account.
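The pipeline described in the abstract (voiced-frame detection, F0 computation, K-means on the F0 values) can be illustrated with a minimal sketch. The abstract does not state how F0 is computed or how K-means is configured, so everything below is an assumption made for illustration: the autocorrelation pitch estimator, the voicing threshold, the 60 to 400 Hz search range, and the function names `estimate_f0`, `kmeans_1d`, and `classify_f0` are hypothetical and are not taken from the paper or from SPHINX-III.

```python
import numpy as np


def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0,
                voicing_threshold=0.3):
    """Estimate F0 (Hz) of one frame by autocorrelation.

    Returns None when the frame looks unvoiced (assumed criterion: the
    normalized autocorrelation peak is below voicing_threshold).
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    if np.dot(frame, frame) == 0.0:
        return None  # silent frame carries no F0 information
    # Autocorrelation for non-negative lags, normalized so acf[0] == 1.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    acf = acf / acf[0]
    # Restrict the peak search to lags matching plausible pitch periods.
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(acf) - 1)
    if lag_max <= lag_min:
        return None
    peak_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    if acf[peak_lag] < voicing_threshold:
        return None
    return sample_rate / peak_lag


def kmeans_1d(values, k=2, iterations=50):
    """Plain K-means on scalar F0 values; returns (centroids, labels).

    Centroids start at evenly spaced quantiles, so they stay sorted and
    index 0 is always the lowest-F0 cluster.
    """
    values = np.asarray(values, dtype=float)
    centroids = np.quantile(values, np.linspace(0.0, 1.0, k))
    for _ in range(iterations):
        labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        updated = centroids.copy()
        for j in range(k):
            if np.any(labels == j):
                updated[j] = values[labels == j].mean()
        if np.allclose(updated, centroids):
            break
        centroids = updated
    labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
    return centroids, labels


def classify_f0(f0, centroids):
    """Return the index of the nearest centroid for one F0 value."""
    return int(np.argmin(np.abs(np.asarray(centroids) - f0)))
```

In this sketch, training would collect the F0 values of all voiced frames and call `kmeans_1d(f0_values, k=2)`; at test time, `classify_f0` assigns an utterance's F0 to the nearer centroid. Treating the lower-F0 cluster as male and the higher one as female is a further assumption that the gender labels of the training recordings would have to confirm.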
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Bhukya, S. (2019). Accent Issues in Continuous Speech Recognition System. In: Bapi, R., Rao, K., Prasad, M. (eds) First International Conference on Artificial Intelligence and Cognitive Computing. Advances in Intelligent Systems and Computing, vol 815. Springer, Singapore. https://doi.org/10.1007/978-981-13-1580-0_39
DOI: https://doi.org/10.1007/978-981-13-1580-0_39
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1579-4
Online ISBN: 978-981-13-1580-0
eBook Packages: Intelligent Technologies and Robotics (R0)