Abstract
The Donggan language, a variant of Mandarin, is spoken by the Donggan people in Central Asia and includes a Gansu dialect and a Shaanxi dialect. This paper proposes a convolutional neural network (CNN) based speech recognition method for the Donggan Shaanxi dialect. A text corpus and a pronunciation dictionary were designed for the Donggan Shaanxi dialect, and the corresponding speech corpus was recorded. The acoustic models of the Donggan Shaanxi dialect were then trained with a CNN. Experimental results demonstrate that the proposed CNN-based method achieves a lower word error rate than the monophone hidden Markov model (HMM) based method, the triphone HMM-based method, and the deep neural network (DNN) based method.
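The comparison in the abstract is stated in terms of word error rate (WER). As a minimal sketch, assuming the standard definition (substitutions, deletions, and insertions divided by the number of reference words) and whitespace-separated tokens, WER could be computed as below. The abstract does not say how the authors tokenized or scored their transcripts, so the function name, the tokenization, and the example strings are illustrative assumptions, not the paper's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Assumption: tokens are separated by whitespace. For Donggan transcripts
    the scoring unit (word, syllable, or character) would need to match
    whatever the evaluation actually used.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis tokens (Levenshtein dynamic programming).
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference tokens to reach an empty hypothesis
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference token missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis token)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts, for illustration only.
print(word_error_rate("a b c d", "a x c"))
```

Here the hypothesis contains one substitution and one deletion against a four-token reference. Note that WER can exceed 1.0 when the hypothesis has many insertions.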
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Xu, H., You, Y., Yang, H. (2019). Donggan Speech Recognition Based on Convolution Neural Networks. In: Cheng, X., Jing, W., Song, X., Lu, Z. (eds) Data Science. ICPCSEE 2019. Communications in Computer and Information Science, vol 1058. Springer, Singapore. https://doi.org/10.1007/978-981-15-0118-0_44
DOI: https://doi.org/10.1007/978-981-15-0118-0_44
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-0117-3
Online ISBN: 978-981-15-0118-0
eBook Packages: Computer Science (R0)