Abstract
The study of lip movements is relevant to a range of real-world applications, both for enhancing means of communication and in medicine. In this paper we describe a method we implemented to help patients with Amyotrophic Lateral Sclerosis (ALS) communicate once the progression of the disease requires intubation and the voice is lost.
The method relies on several subsystems to carry out this complex task, and the results are promising. However, the method needs to be improved to make the system easier to use and more reliable in predicting the pronounced words.
Notes
1. RGB is a color model in which a color is expressed as a triple giving the amounts of red, green and blue, each ranging from 0 to 255.
2. YIQ is the color space used by the NTSC color TV system, which is used mainly in North America, Central America and Japan.
3. A viseme is a generic facial image that can be used to describe a particular sound. A viseme is the visual equivalent of a phoneme, or unit of sound, in spoken language. Using visemes, the hearing-impaired can view sounds visually, effectively "lip-reading" the entire human face.
4. XT9 is a text prediction and correction system for mobile devices with full keyboards. It is a successor to T9, a popular predictive text algorithm for mobile phones with only numeric pads.
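To make notes 1 and 2 concrete, the following minimal sketch converts an 8-bit RGB triple into YIQ using the `colorsys` module from the Python standard library. It is only an illustration of the two color spaces: the helper name `rgb255_to_yiq` is hypothetical, and nothing here should be read as the color transformation actually used by the method described in the paper.

```python
import colorsys


def rgb255_to_yiq(r, g, b):
    """Convert an RGB triple with channels in 0..255 to a YIQ triple.

    Hypothetical helper for illustration; not taken from the paper.
    """
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("each RGB channel must be in the range 0..255")
    # colorsys expects floating-point channels in [0, 1], so rescale first.
    return colorsys.rgb_to_yiq(r / 255.0, g / 255.0, b / 255.0)


if __name__ == "__main__":
    # Y carries luminance; I and Q carry the chrominance information.
    y, i, q = rgb255_to_yiq(255, 0, 0)
    print(f"Y={y:.3f} I={i:.3f} Q={q:.3f}")
```

Separating luminance (Y) from chrominance (I, Q) is the usual reason for moving from RGB to a space such as YIQ before working on color-based image regions.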
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Gervasi, O., Magni, R., Ferri, M. (2016). A Method for Predicting Words by Interpreting Labial Movements. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2016. Lecture Notes in Computer Science, vol 9787. Springer, Cham. https://doi.org/10.1007/978-3-319-42108-7_34
DOI: https://doi.org/10.1007/978-3-319-42108-7_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42107-0
Online ISBN: 978-3-319-42108-7
eBook Packages: Computer Science, Computer Science (R0)