Abstract
Speech is essential for exchanging information, and speech recognition is one of the interfaces for man-machine interaction. However, the performance of such systems is limited in noisy acoustic conditions. Silent speech, i.e., the visual dynamic features of speech, offers further potential information for Human-Computer Interaction. This paper presents lip localization and segmentation using the Otsu algorithm. The height and width of the lips during movement are captured as visual cues for silent speech recognition. We develop stochastic visual word models with an in-house database of 20 subjects. The performance of these models is measured by word error rate. The recorded accuracy of the system is 84.6% for speaker-dependent female subjects and 65.8% overall.
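The abstract does not give implementation details, so the following is only a minimal sketch of how Otsu thresholding can yield lip width and height from a single frame. It assumes a grayscale mouth region in which lip pixels are darker than the surrounding skin; the function names `otsu_threshold` and `lip_width_height`, the `lip_is_darker` flag, and the list-of-rows image representation are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale rows, integer values in 0..255 (assumed input format)


def otsu_threshold(image: Image) -> int:
    """Return the gray level t that maximizes the between-class variance."""
    hist = [0] * 256
    for row in image:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with level <= t
    sum0 = 0  # intensity sum of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no pixels in the lower class yet
        w1 = total - w0
        if w1 == 0:
            break     # all pixels are in the lower class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def lip_width_height(image: Image, lip_is_darker: bool = True) -> Tuple[int, int]:
    """Segment the lip with Otsu's threshold and return (width, height)
    of the bounding box of the segmented pixels, in pixels.

    Hypothetical helper: which side of the threshold is the lip depends on
    the colour transform used, so it is exposed as a flag here.
    """
    t = otsu_threshold(image)
    rows, cols = [], []
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if (value <= t) == lip_is_darker:
                rows.append(r)
                cols.append(c)
    if not rows:
        return (0, 0)
    return (max(cols) - min(cols) + 1, max(rows) - min(rows) + 1)


if __name__ == "__main__":
    # Synthetic 6x8 frame: bright background with a dark 2x6 "lip" block.
    frame = [[200] * 8 for _ in range(6)]
    for r in (2, 3):
        for c in range(1, 7):
            frame[r][c] = 50
    print(lip_width_height(frame))
```

Computed per video frame, such (width, height) pairs would form the kind of feature sequence that a stochastic word model could be trained on; a real pipeline would also need face and mouth localization and noise handling, which this sketch omits.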
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Kandagal, A.P., Udayashankara, V., Anusuya, M.A. (2018). Silent Speech Recognition. In: Nagabhushan, T., Aradhya, V.N.M., Jagadeesh, P., Shukla, S., M.L., C. (eds) Cognitive Computing and Information Processing. CCIP 2017. Communications in Computer and Information Science, vol 801. Springer, Singapore. https://doi.org/10.1007/978-981-10-9059-2_13
DOI: https://doi.org/10.1007/978-981-10-9059-2_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-9058-5
Online ISBN: 978-981-10-9059-2
eBook Packages: Computer Science, Computer Science (R0)