Correct Speech Visemes as a Root of Total Communication Method for Deaf People

  • Eva Pajorová
  • Ladislav Hluchý
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7327)


Many deaf people use lip reading as their main form of communication. A viseme is a representational unit used to classify speech sounds in the visual domain; it describes the particular facial and oral positions and movements that accompany the voicing of phonemes. We have designed a tool for creating correct speech visemes. It is composed of five modules: one for creating phonemes, one for creating 3D speech visemes, one for facial expressions, one for synchronizing phonemes with visemes, and one for generating speech triphones. We are testing the correctness of the generated visemes on Slovak speech domains. This paper describes the developed tool.
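The module pipeline described above (phoneme creation, phoneme–viseme synchronization, and triphone generation) can be sketched in miniature. The mapping table, class names, and the silence marker below are illustrative assumptions, not the authors' actual data structures:

```python
from dataclasses import dataclass

# Hypothetical phoneme-to-viseme grouping for a few phonemes; the real
# tool's Slovak viseme inventory is not published in this abstract.
PHONEME_TO_VISEME = {
    "p": "bilabial_closed", "b": "bilabial_closed", "m": "bilabial_closed",
    "f": "labiodental", "v": "labiodental",
    "a": "open_jaw", "e": "mid_open", "i": "spread_lips",
    "o": "rounded", "u": "rounded",
}

@dataclass
class TimedUnit:
    """A speech unit (phoneme or viseme) with start/end times in seconds."""
    label: str
    start: float
    end: float

def synchronize(phonemes):
    """Map a timed phoneme sequence onto a timed viseme sequence,
    merging consecutive phonemes that share the same viseme shape."""
    visemes = []
    for p in phonemes:
        v = PHONEME_TO_VISEME.get(p.label, "neutral")
        if visemes and visemes[-1].label == v:
            visemes[-1].end = p.end  # extend the current viseme in time
        else:
            visemes.append(TimedUnit(v, p.start, p.end))
    return visemes

def triphones(phonemes):
    """Generate (previous, current, next) triphone contexts for
    context-dependent synthesis; utterance edges use a silence marker."""
    labels = ["sil"] + [p.label for p in phonemes] + ["sil"]
    return list(zip(labels, labels[1:], labels[2:]))
```

For the word "mama", `synchronize` would alternate between a closed-lips viseme and an open-jaw viseme, while `triphones` yields the contexts needed to model coarticulation between neighboring sounds.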


Keywords: Facial Expression, Sign Language, Deaf Child, Total Communication, Speech Synthesis




Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Eva Pajorová (1)
  • Ladislav Hluchý (1)
  1. Institute of Informatics, Slovak Academy of Sciences, Bratislava, Slovakia
