Speech Processing and Recognition System

Sen, Soumya; Dutta, Anjan; Dey, Nilanjan

doi:10.1007/978-981-13-6098-5_2

Part of the book series: SpringerBriefs in Applied Sciences and Technology (BRIEFSINTELL)

Abstract

In the first decade of the twentieth century, scientists in the Bell System realized that universal services such as telephony were becoming feasible owing to a large-scale technological revolution [1].

References

Kamm, C., Walker, M., & Rabiner, L. (1997). The role of speech processing in human–computer intelligent communication. Speech Communication, 23(4), 263–278.
Retrieved July 08, 2018, from https://www.sciencedirect.com/topics/neuroscience/speech-processing.
Dey, N., & Ashour, A. S. (2018). Challenges and future perspectives in speech-sources direction of arrival estimation and localization. In Direction of arrival estimation and localization of multi-speech sources (pp. 49–52). Cham: Springer.
Dey, N., & Ashour, A. S. (2018). Direction of arrival estimation and localization of multi-speech sources. Springer International Publishing.
Dey, N., & Ashour, A. S. (2018). Applied examples and applications of localization and tracking problem of multiple speech sources. In Direction of arrival estimation and localization of multi-speech sources (pp. 35–48). Cham: Springer.
Dey, N., & Ashour, A. S. (2018). Microphone array principles. In Direction of arrival estimation and localization of multi-speech sources (pp. 5–22). Cham: Springer.
Kamal, M. S., Chowdhury, L., Khan, M. I., Ashour, A. S., Tavares, J. M. R., & Dey, N. (2017). Hidden Markov model and Chapman Kolmogrov for protein structures prediction from images. Computational Biology and Chemistry, 68, 231–244.
Mahendru, H. C. (2014). Quick review of human speech production mechanism. International Journal of Engineering Research and Development, 9(10), 48–54.
Shirodkar, N. S. (2016). Konkani Speech to Text Recognition using Hidden Markov Model Toolkit (Masters dissertation). Retrieved July 08, 2018, from https://www.kom.aau.dk/group/04gr742/pdf/speech_production.pdf.
Retrieved July 08, 2018, from https://www.youtube.com/watch?v=Xjzm7S__kBU.
Sood, S., & Krishnamurthy, A. (2004, October). A robust on-the-fly pitch (OTFP) estimation algorithm. In Proceedings of the 12th Annual ACM International Conference on Multimedia (pp. 280–283). ACM.
De Cheveigné, A., & Kawahara, H. (2002). YIN, a fundamental frequency estimator for speech and music. The Journal of the Acoustical Society of America, 111(4), 1917–1930.
Chowdhury, S., Datta, A. K., & Chaudhuri, B. B. (2000). Pitch detection algorithm using state phase analysis. Journal of the Acoustical Society of India, 28(1–4), 247–250.
Yu, Y. (2012, March). Research on speech recognition technology and its application. In 2012 International Conference on Computer Science and Electronics Engineering (ICCSEE), (Vol. 1, pp. 306–309). IEEE.
Retrieved July 20, 2018, from https://www.youtube.com/watch?v=q67z7PTGRi8&t=4294s.
Dey, N., Ashour, A. S., Mohamed, W. S., & Nguyen, N. G. (2019). Acoustic wave technology. In Acoustic sensors for biomedical applications (pp. 21–31). Cham: Springer.
Dey, N., Ashour, A. S., Mohamed, W. S., & Nguyen, N. G. (2019). Acoustic sensors in biomedical applications. In Acoustic sensors for biomedical applications (pp. 43–47). Cham: Springer.
Khiatani, D., & Ghose, U. (2017, October). Weather forecasting using hidden Markov model. In 2017 International Conference on Computing and Communication Technologies for Smart Nation (IC3TSN), (pp. 220–225). IEEE.
Tokuda, K., Nankaku, Y., Toda, T., Zen, H., Yamagishi, J., & Oura, K. (2013). Speech synthesis based on hidden Markov models. Proceedings of the IEEE, 101(5), 1234–1252.
Retrieved July 20, 2018, from https://www.youtube.com/watch?v=kNloj1Qtf0Y&t=1500s.
Gales, M., & Young, S. (2008). The application of hidden Markov models in speech recognition. Foundations and Trends® in Signal Processing, 1(3), 195–304.
Rabiner, L. R., & Juang, B. H. (1992). Hidden Markov models for speech recognition—strengths and limitations. In Speech recognition and understanding (pp. 3–29). Heidelberg: Springer.
Hore, S., Bhattacharya, T., Dey, N., Hassanien, A. E., Banerjee, A., & Chaudhuri, S. B. (2016). A real time dactylology based feature extraction for selective image encryption and artificial neural network. In Image feature detectors and descriptors (pp. 203–226). Cham: Springer.
Samanta, S., Kundu, D., Chakraborty, S., Dey, N., Gaber, T., Hassanien, A. E., & Kim, T. H. (2015, September). Wooden Surface classification based on Haralick and the Neural Networks. In 2015 Fourth International Conference on Information Science and Industrial Applications (ISI), (pp. 33–39). IEEE.
Kotyk, T., Ashour, A. S., Chakraborty, S., Dey, N., & Balas, V. E. (2015). Apoptosis analysis in classification paradigm: a neural network based approach. In Healthy World Conference (pp. 17–22).
Agrawal, S., Singh, B., Kumar, R., & Dey, N. (2019). Machine learning for medical diagnosis: A neural network classifier optimized via the directed bee colony optimization algorithm. In U-Healthcare monitoring systems (pp. 197–215). Academic Press.
Wang, Y., Chen, Y., Yang, N., Zheng, L., Dey, N., Ashour, A. S., … & Shi, F. (2018). Classification of mice hepatic granuloma microscopic images based on a deep convolutional neural network. Applied Soft Computing.
Lan, K., Wang, D. T., Fong, S., Liu, L. S., Wong, K. K., & Dey, N. (2018). A survey of data mining and deep learning in bioinformatics. Journal of Medical Systems, 42(8), 139.
Hu, S., Liu, M., Fong, S., Song, W., Dey, N., & Wong, R. (2018). Forecasting China future MNP by deep learning. In Behavior engineering and applications (pp. 169–210). Cham: Springer.
Dey, N., Fong, S., Song, W., & Cho, K. (2017, August). Forecasting energy consumption from smart home sensor network by deep learning. In International Conference on Smart Trends for Information Technology and Computer Communications (pp. 255–265). Singapore: Springer.
Dey, N., Ashour, A. S., & Nguyen, G. N. Recent advancement in multimedia content using deep learning.
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323(6088), 533.
Mohamed, A. R., Dahl, G. E., & Hinton, G. (2012). Acoustic modeling using deep belief networks. IEEE Transactions on Audio, Speech & Language Processing, 20(1), 14–22.
Graves, A., Mohamed, A. R., & Hinton, G. (2013, May). Speech recognition with deep recurrent neural networks. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 6645–6649). IEEE.
Retrieved July 21, 2018, from https://medium.com/@ageitgey/machine-learning-is-fun-part-6-how-to-do-speech-recognition-with-deep-learning-28293c162f7a.
Browman, C. P., & Goldstein, L. (1992). Articulatory phonology: An overview. Phonetica, 49(3–4), 155–180.
Livescu, K., Jyothi, P., & Fosler-Lussier, E. (2016). Articulatory feature-based pronunciation modeling. Computer Speech & Language, 36, 212–232.
Retrieved July 22, 2018, from http://www.speech.sri.com/projects/srilm/.
Retrieved July 22, 2018, from https://kheafield.com/code/kenlm/.
Chen, S. F., & Goodman, J. (1999). An empirical study of smoothing techniques for language modeling. Computer Speech & Language, 13(4), 359–394.
Retrieved July 24, 2018, from https://www.slideshare.net/ssrdigvijay88/ngrams-smoothing.
Retrieved July 24, 2018, from https://www.inf.ed.ac.uk/teaching/courses/asr/2011-12/asr-search-nup4.pdf.
Viterbi, A. (1967). Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 13(2), 260–269.
Gerber, M., Kaufmann, T., & Pfister, B. (2011, May). Extended Viterbi algorithm for optimized word HMMs. In 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 4932–4935). IEEE.

Author information

Authors and Affiliations

A.K. Choudhury School of Information Technology, University of Calcutta, Kolkata, West Bengal, India
Soumya Sen
Department of Information Technology, Techno India College of Technology, Kolkata, West Bengal, India
Anjan Dutta
Department of Information Technology, Techno India College of Technology, Kolkata, West Bengal, India
Nilanjan Dey

About this chapter

Cite this chapter

Sen, S., Dutta, A., Dey, N. (2019). Speech Processing and Recognition System. In: Audio Processing and Speech Recognition. SpringerBriefs in Applied Sciences and Technology. Springer, Singapore. https://doi.org/10.1007/978-981-13-6098-5_2

DOI: https://doi.org/10.1007/978-981-13-6098-5_2
Published: 31 January 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-6097-8
Online ISBN: 978-981-13-6098-5
eBook Packages: Engineering, Engineering (R0)