Abstract
Speech recognition is the ability of a machine to identify spoken words and classify them into the appropriate category. The first stage in the speech recognition process is the extraction of appropriate features from the recorded words. We propose a novel algorithm for feature extraction using diatonic frequency cepstral coefficients. Diatonic frequencies are derived from a musical scale called the diatonic scale. The scale is based on the harmonics of sound and models the nonlinear behavior of the human auditory filter. After feature extraction, the classification stage uses a hybrid classifier that combines an artificial neural network with fuzzy logic. When the difference between the prediction values at the output of the neural network is small, a neural network classifier on its own tends to match the wrong pattern. The proposed algorithm overcomes this drawback using fuzzy logic. The proposed hybrid classifier improves the recognition rate significantly over existing classifiers. The test bed used in the experiments focuses on the Marathi language, the native language spoken in the state of Maharashtra.
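The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the tuning of the diatonic scale, the base frequency, the number of filters, or the fuzzy rules. The sketch assumes the just-intonation major scale ratios, and the names `diatonic_center_frequencies`, `is_ambiguous`, `base_hz`, `n_octaves`, and `margin` are hypothetical.

```python
import numpy as np

# Just-intonation ratios of the diatonic major scale within one octave.
# ASSUMPTION: the abstract does not say which tuning the paper uses.
DIATONIC_RATIOS = (1.0, 9 / 8, 5 / 4, 4 / 3, 3 / 2, 5 / 3, 15 / 8)


def diatonic_center_frequencies(base_hz, n_octaves):
    """Return candidate filter center frequencies spaced on a diatonic scale.

    Each octave doubles the base frequency, so the spacing between
    neighbouring centers grows with frequency (a nonlinear spacing).
    """
    freqs = [
        base_hz * (2 ** octave) * ratio
        for octave in range(n_octaves)
        for ratio in DIATONIC_RATIOS
    ]
    return np.array(freqs)


def is_ambiguous(scores, margin=0.1):
    """Return True when the two largest network outputs differ by less than margin.

    ASSUMPTION: the margin value is illustrative; the abstract only says
    that a small difference between output values causes wrong matches.
    """
    top_two = np.sort(np.asarray(scores, dtype=float))[-2:]
    return bool(top_two[1] - top_two[0] < margin)


if __name__ == "__main__":
    print(diatonic_center_frequencies(100.0, 2))
    print(is_ambiguous([0.10, 0.48, 0.45]))
```

In a hybrid scheme of the kind the abstract describes, an utterance flagged by a check like `is_ambiguous` would be passed to the fuzzy-logic stage rather than labelled directly from the largest network output.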
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Kondhalkar, H., Mukherji, P. (2019). Speech Recognition Using Novel Diatonic Frequency Cepstral Coefficients and Hybrid Neuro Fuzzy Classifier. In: Pandian, D., Fernando, X., Baig, Z., Shi, F. (eds) Proceedings of the International Conference on ISMAC in Computational Vision and Bio-Engineering 2018 (ISMAC-CVB). ISMAC 2018. Lecture Notes in Computational Vision and Biomechanics, vol 30. Springer, Cham. https://doi.org/10.1007/978-3-030-00665-5_76
DOI: https://doi.org/10.1007/978-3-030-00665-5_76
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00664-8
Online ISBN: 978-3-030-00665-5
eBook Packages: Engineering, Engineering (R0)