Speech Processing and Recognition System

  • Soumya Sen
  • Anjan Dutta
  • Nilanjan Dey
Chapter
Part of the SpringerBriefs in Applied Sciences and Technology book series (BRIEFSAPPLSCIENCES)

Abstract

In the first decade of the twentieth century, scientists at the Bell System realized that universal services such as telephony were becoming feasible owing to a large-scale technological revolution [1].

References

  1. Kamm, C., Walker, M., & Rabiner, L. (1997). The role of speech processing in human–computer intelligent communication. Speech Communication, 23(4), 263–278.
  2.
  3. Dey, N., & Ashour, A. S. (2018). Challenges and future perspectives in speech-sources direction of arrival estimation and localization. In Direction of arrival estimation and localization of multi-speech sources (pp. 49–52). Cham: Springer.
  4. Dey, N., & Ashour, A. S. (2018). Direction of arrival estimation and localization of multi-speech sources. Springer International Publishing.
  5. Dey, N., & Ashour, A. S. (2018). Applied examples and applications of localization and tracking problem of multiple speech sources. In Direction of arrival estimation and localization of multi-speech sources (pp. 35–48). Cham: Springer.
  6. Dey, N., & Ashour, A. S. (2018). Microphone array principles. In Direction of arrival estimation and localization of multi-speech sources (pp. 5–22). Cham: Springer.
  7. Kamal, M. S., Chowdhury, L., Khan, M. I., Ashour, A. S., Tavares, J. M. R., & Dey, N. (2017). Hidden Markov model and Chapman Kolmogrov for protein structures prediction from images. Computational Biology and Chemistry, 68, 231–244.
  8. Mahendru, H. C. (2014). Quick review of human speech production mechanism. International Journal of Engineering Research and Development, 9(10), 48–54.
  9. Shirodkar, N. S. (2016). Konkani speech to text recognition using Hidden Markov Model Toolkit (Masters dissertation). Retrieved July 08, 2018, from https://www.kom.aau.dk/group/04gr742/pdf/speech_production.pdf.
  10. Retrieved July 08, 2018, from https://www.youtube.com/watch?v=Xjzm7S__kBU.
  11. Sood, S., & Krishnamurthy, A. (2004, October). A robust on-the-fly pitch (OTFP) estimation algorithm. In Proceedings of the 12th Annual ACM International Conference on Multimedia (pp. 280–283). ACM.
  12. De Cheveigné, A., & Kawahara, H. (2002). YIN, a fundamental frequency estimator for speech and music. The Journal of the Acoustical Society of America, 111(4), 1917–1930.
  13. Chowdhury, S., Datta, A. K., & Chaudhuri, B. B. (2000). Pitch detection algorithm using state phase analysis. J Acoust Soc India, 28(1–4), 247–250.
  14. Yu, Y. (2012, March). Research on speech recognition technology and its application. In 2012 International Conference on Computer Science and Electronics Engineering (ICCSEE) (Vol. 1, pp. 306–309). IEEE.
  15.
  16. Dey, N., Ashour, A. S., Mohamed, W. S., & Nguyen, N. G. (2019). Acoustic wave technology. In Acoustic sensors for biomedical applications (pp. 21–31). Cham: Springer.
  17. Dey, N., Ashour, A. S., Mohamed, W. S., & Nguyen, N. G. (2019). Acoustic sensors in biomedical applications. In Acoustic sensors for biomedical applications (pp. 43–47). Cham: Springer.
  18. Khiatani, D., & Ghose, U. (2017, October). Weather forecasting using hidden Markov model. In 2017 International Conference on Computing and Communication Technologies for Smart Nation (IC3TSN) (pp. 220–225). IEEE.
  19. Tokuda, K., Nankaku, Y., Toda, T., Zen, H., Yamagishi, J., & Oura, K. (2013). Speech synthesis based on hidden Markov models. Proceedings of the IEEE, 101(5), 1234–1252.
  20.
  21. Gales, M., & Young, S. (2008). The application of hidden Markov models in speech recognition. Foundations and Trends® in Signal Processing, 1(3), 195–304.
  22. Rabiner, L. R., & Juang, B. H. (1992). Hidden Markov models for speech recognition: strengths and limitations. In Speech recognition and understanding (pp. 3–29). Heidelberg: Springer.
  23. Hore, S., Bhattacharya, T., Dey, N., Hassanien, A. E., Banerjee, A., & Chaudhuri, S. B. (2016). A real time dactylology based feature extraction for selective image encryption and artificial neural network. In Image feature detectors and descriptors (pp. 203–226). Cham: Springer.
  24. Samanta, S., Kundu, D., Chakraborty, S., Dey, N., Gaber, T., Hassanien, A. E., & Kim, T. H. (2015, September). Wooden Surface classification based on Haralick and the Neural Networks. In 2015 Fourth International Conference on Information Science and Industrial Applications (ISI) (pp. 33–39). IEEE.
  25. Kotyk, T., Ashour, A. S., Chakraborty, S., Dey, N., & Balas, V. E. (2015). Apoptosis analysis in classification paradigm: a neural network based approach. In Healthy World Conference (pp. 17–22).
  26. Agrawal, S., Singh, B., Kumar, R., & Dey, N. (2019). Machine learning for medical diagnosis: A neural network classifier optimized via the directed bee colony optimization algorithm. In U-Healthcare monitoring systems (pp. 197–215). Academic Press.
  27. Wang, Y., Chen, Y., Yang, N., Zheng, L., Dey, N., Ashour, A. S., … & Shi, F. (2018). Classification of mice hepatic granuloma microscopic images based on a deep convolutional neural network. Applied Soft Computing.
  28. Lan, K., Wang, D. T., Fong, S., Liu, L. S., Wong, K. K., & Dey, N. (2018). A survey of data mining and deep learning in bioinformatics. Journal of Medical Systems, 42(8), 139.
  29. Hu, S., Liu, M., Fong, S., Song, W., Dey, N., & Wong, R. (2018). Forecasting China future MNP by deep learning. In Behavior engineering and applications (pp. 169–210). Cham: Springer.
  30. Dey, N., Fong, S., Song, W., & Cho, K. (2017, August). Forecasting energy consumption from smart home sensor network by deep learning. In International Conference on Smart Trends for Information Technology and Computer Communications (pp. 255–265). Singapore: Springer.
  31. Dey, N., Ashour, A. S., & Nguyen, G. N. Recent advancement in multimedia content using deep learning.
  32. Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323(6088), 533.
  33. Mohamed, A. R., Dahl, G. E., & Hinton, G. (2012). Acoustic modeling using deep belief networks. IEEE Transactions on Audio, Speech & Language Processing, 20(1), 14–22.
  34. Graves, A., Mohamed, A. R., & Hinton, G. (2013, May). Speech recognition with deep recurrent neural networks. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 6645–6649). IEEE.
  35.
  36. Browman, C. P., & Goldstein, L. (1992). Articulatory phonology: An overview. Phonetica, 49(3–4), 155–180.
  37. Livescu, K., Jyothi, P., & Fosler-Lussier, E. (2016). Articulatory feature-based pronunciation modeling. Computer Speech & Language, 36, 212–232.
  38. Retrieved July 22, 2018, from http://www.speech.sri.com/projects/srilm/.
  39. Retrieved July 22, 2018, from https://kheafield.com/code/kenlm/.
  40. Chen, S. F., & Goodman, J. (1999). An empirical study of smoothing techniques for language modeling. Computer Speech & Language, 13(4), 359–394.
  41.
  42.
  43. Viterbi, A. (1967). Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 13(2), 260–269.
  44. Gerber, M., Kaufmann, T., & Pfister, B. (2011, May). Extended Viterbi algorithm for optimized word HMMs. In 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 4932–4935). IEEE.

Copyright information

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Soumya Sen
    • 1
  • Anjan Dutta
    • 2
  • Nilanjan Dey
    • 3
  1. A.K. Choudhury School of Information Technology, University of Calcutta, Kolkata, India
  2. Department of Information Technology, Techno India College of Technology, Kolkata, India
  3. Department of Information Technology, Techno India College of Technology, Kolkata, India