A Method for Predicting Words by Interpreting Labial Movements

Gervasi, Osvaldo; Magni, Riccardo; Ferri, Matteo

doi:10.1007/978-3-319-42108-7_34

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9787)

Included in the following conference series:

International Conference on Computational Science and Its Applications

Abstract

The study of lip movements is relevant to a range of real-world applications, both for enhancing means of communication and in medicine. In this paper we illustrate a method we implemented to help Amyotrophic Lateral Sclerosis (ALS) patients communicate once the progression of the disease requires the patient to be intubated and the voice is lost.

The method uses several subsystems to carry out this complex task, and the results are promising. However, the method needs to be improved to make the system easier to use and more reliable in predicting the pronounced words.

Notes

1. RGB is a color model in which each color is expressed as a triple giving the amounts of red, green and blue, each ranging from 0 to 255.
2. YIQ is the color space used by the NTSC color TV system, which is used mainly in North America, Central America and Japan.
3. A viseme is a generic facial image that can be used to describe a particular sound. It is the visual equivalent of a phoneme, the unit of sound in spoken language. Using visemes, the hearing-impaired can perceive sounds visually, in effect "lip-reading" the entire human face.
4. XT9 is a text prediction and correction system for mobile devices with full keyboards. It is the successor to T9, a popular predictive text algorithm for mobile phones with numeric keypads only.
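
Notes 1 and 2 can be made concrete with a short conversion from an 8-bit RGB triple to YIQ. This is only a minimal sketch: the notes do not say how the chapter uses these color spaces, the helper name `rgb255_to_yiq` is hypothetical, and the conversion coefficients are those of Python's standard `colorsys` module, which may differ slightly from other NTSC references.

```python
import colorsys


def rgb255_to_yiq(r, g, b):
    """Convert an RGB triple with components in 0..255 to YIQ.

    Hypothetical helper for illustration; not taken from the chapter.
    Y is luminance, I and Q carry the chrominance.
    """
    for component in (r, g, b):
        if not 0 <= component <= 255:
            raise ValueError("RGB components must be in the range 0..255")
    # colorsys expects floats in [0, 1], so rescale the 8-bit values first.
    return colorsys.rgb_to_yiq(r / 255, g / 255, b / 255)


if __name__ == "__main__":
    for name, rgb in [("white", (255, 255, 255)), ("red", (255, 0, 0))]:
        y, i, q = rgb255_to_yiq(*rgb)
        print(f"{name}: Y={y:.3f} I={i:.3f} Q={q:.3f}")
```

Neutral colors (black, grey, white) have I and Q close to zero, so the chrominance channels respond only to colored regions of an image.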

Author information

Authors and Affiliations

Department of Mathematics and Computer Science, University of Perugia, Perugia, Italy
Osvaldo Gervasi
Pragma Engineering SrL, Perugia, Italy
Riccardo Magni
Prometeia SpA, Bologna, Italy
Matteo Ferri

Corresponding author

Correspondence to Osvaldo Gervasi.

Editor information

Editors and Affiliations

University of Perugia, Perugia, Italy
Osvaldo Gervasi
University of Basilicata, Potenza, Italy
Beniamino Murgante
Covenant University, Ota, Nigeria
Sanjay Misra
University of Minho, Braga, Portugal
Ana Maria A.C. Rocha
Polytechnic University, Bari, Italy
Carmelo M. Torre
Monash University, Clayton, Victoria, Australia
David Taniar
Kyushu Sangyo University, Fukuoka, Japan
Bernady O. Apduhan
Saint Petersburg State University, Saint Petersburg, Russia
Elena Stankova
Beijing University of Posts & Telecommunication, Beijing, China
Shangguang Wang

About this paper

Cite this paper

Gervasi, O., Magni, R., Ferri, M. (2016). A Method for Predicting Words by Interpreting Labial Movements. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2016. Lecture Notes in Computer Science, vol 9787. Springer, Cham. https://doi.org/10.1007/978-3-319-42108-7_34

DOI: https://doi.org/10.1007/978-3-319-42108-7_34
Published: 12 July 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42107-0
Online ISBN: 978-3-319-42108-7
eBook Packages: Computer Science, Computer Science (R0)
