Lip-Reading Technique Using Spatio-Temporal Templates and Support Vector Machines

  • Wai Chee Yau
  • Dinesh Kant Kumar
  • Tharangini Chinnadurai
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5197)


This paper presents a lip-reading technique that identifies unspoken phones using support vector machines. The proposed system is based on temporal integration of the video data to generate spatio-temporal templates (STT). Sixty-four Zernike moments (ZM) are extracted from each STT. This work proposes a novel feature selection algorithm that reduces the dimensionality of the 64 ZM to 12 features. The proposed technique uses the shape of the probability curve as a goodness measure for optimal feature selection. The feature vectors are classified using non-linear support vector machines. Such a system could be invaluable when it is important to communicate without making a sound, such as when giving passwords in public spaces.
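The temporal integration step can be illustrated with a minimal sketch. The abstract does not give the exact STT formulation, so the code below assumes a motion-history-style template in the spirit of Bobick and Davis's temporal templates: consecutive frames are differenced, thresholded, and stamped with their time index so that more recent motion appears brighter. The function name `motion_history_image` and the `threshold` parameter are hypothetical, chosen only for this illustration.

```python
import numpy as np


def motion_history_image(frames, threshold=0.1):
    """Collapse a grayscale clip shaped (T, H, W) into one 2-D template.

    Assumption: this is a generic motion-history template, not necessarily
    the exact STT definition used in the paper.
    """
    frames = np.asarray(frames, dtype=float)
    if frames.ndim != 3 or frames.shape[0] < 2:
        raise ValueError("expected at least two frames shaped (T, H, W)")

    n_diffs = frames.shape[0] - 1
    template = np.zeros(frames.shape[1:], dtype=float)

    for t in range(1, frames.shape[0]):
        # Binary motion mask: pixels whose intensity changed noticeably.
        moving = np.abs(frames[t] - frames[t - 1]) > threshold
        # Stamp moving pixels with the time index; still pixels keep
        # whatever older value they already hold.
        template[moving] = t

    # Scale to [0, 1] so the most recent motion has intensity 1.
    return template / n_diffs


# Tiny synthetic clip: one pixel changes at step 1, another at step 2.
clip = np.zeros((3, 2, 2))
clip[1:, 0, 0] = 1.0  # changes between frame 0 and frame 1
clip[2, 1, 1] = 1.0   # changes between frame 1 and frame 2
stt = motion_history_image(clip)
```

In this toy clip, the pixel that moved last ends up brightest and pixels that never moved stay at zero. In a pipeline like the one described, a template of this kind would be the single image from which the Zernike moments are then computed.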


Keywords: visual speech recognition, motion segmentation, feature selection, Zernike moments, support vector machines



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Wai Chee Yau (1)
  • Dinesh Kant Kumar (1)
  • Tharangini Chinnadurai (1)
  1. School of Electrical and Computer Engineering, RMIT University, Australia
