An Improvement of Basic Mouth Shape Detection Rate from Japanese Utterance Image Sequence Using Optical Flow

Miyazaki, Tsuyoshi; Nakashima, Toyoshiro; Ishii, Naohiro

doi:10.1007/978-3-642-32172-6_7


Part of the book series: Studies in Computational Intelligence (SCI, volume 443)

Abstract

In this paper, we describe an improvement to our method for detecting distinctive mouth shapes in Japanese utterance image sequences. We previously proposed a method that detects these mouth shapes by template matching. Two kinds of mouth shapes are formed when a Japanese phone is pronounced: one at the beginning of the utterance, called the “Beginning Mouth Shape” (BeMS), and one at the end, called the “End Mouth Shape” (EMS). The previous method was able to detect the mouth shapes, but it misdetected them in some cases because the period during which the BeMS is formed is short. We therefore expected that a high-speed camera would be able to capture the BeMS. Experiments showed that it did capture the BeMS, but another problem occurred: a transitional mouth shape, in the middle of changing from one shape to another, was detected as a BeMS. To prevent such false detections, we adopt optical flow. The period during which the mouth is deforming is detected using optical flow, and mouth shapes within that period are excluded from detection. We propose a method for detecting BeMS and EMS in Japanese utterance image sequences using template matching and optical flow.
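
The idea in the abstract, template matching that is suppressed while the mouth is still deforming, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each frame has already been cropped to the mouth region and flattened to a list of grayscale values, and that a dense optical flow field, a list of (dx, dy) vectors per frame, has been computed by some external estimator. The function names, the normalized cross-correlation score, the thresholds, and the sample data are all hypothetical choices made for illustration.

```python
from math import sqrt


def ncc(a, b):
    """Normalized cross-correlation of two equal-length grayscale vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0


def mean_flow_magnitude(flow):
    """Average length of the (dx, dy) flow vectors of one frame."""
    return sum(sqrt(dx * dx + dy * dy) for dx, dy in flow) / len(flow)


def detect_mouth_shapes(frames, flows, templates,
                        flow_thresh=0.5, match_thresh=0.8):
    """Label each frame with the best-matching template name, or None.

    Frames whose mean flow magnitude exceeds flow_thresh are treated as
    "deforming" and skipped, so transitional shapes are never matched.
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    labels = []
    for frame, flow in zip(frames, flows):
        if mean_flow_magnitude(flow) > flow_thresh:
            labels.append(None)  # mouth is still moving: no detection
            continue
        name, score = max(
            ((n, ncc(frame, t)) for n, t in templates.items()),
            key=lambda pair: pair[1],
        )
        labels.append(name if score >= match_thresh else None)
    return labels


# Hypothetical toy data: two templates and three 4-pixel "frames".
templates = {"a": [0, 10, 20, 30], "i": [30, 20, 10, 0]}
frames = [[0, 10, 20, 30], [10, 12, 18, 20], [30, 20, 10, 0]]
flows = [[(0, 0), (0.1, 0)], [(2, 1), (1.5, 0)], [(0, 0), (0, 0.1)]]

labels = detect_mouth_shapes(frames, flows, templates)
ungated = detect_mouth_shapes(frames, flows, templates,
                              flow_thresh=float("inf"))
```

In this toy data the middle frame resembles template "a" closely enough to be matched when the flow gate is disabled (`ungated`), but it is skipped in `labels` because its flow magnitude marks it as a frame in transition. That is the kind of false detection the abstract says optical flow is meant to prevent.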

Author information

Authors and Affiliations

Kanagawa Institute of Technology, 1030 Shimo-ogino, Atsugi, Kanagawa, Japan
Tsuyoshi Miyazaki
Sugiyama Jogakuen University, 17-3 Hoshigaoka-motomachi, Chikusa, Nagoya, Aichi, Japan
Toyoshiro Nakashima
Aichi Institute of Technology, 1247 Yachigusa, Yakusa, Toyota, Aichi, Japan
Naohiro Ishii

Corresponding author

Correspondence to Tsuyoshi Miyazaki.

Editor information

Editors and Affiliations

Software Engineering & Information Technology Institute, Central Michigan University, Mt. Pleasant, 48859, USA
Roger Lee

About this paper

Cite this paper

Miyazaki, T., Nakashima, T., Ishii, N. (2013). An Improvement of Basic Mouth Shape Detection Rate from Japanese Utterance Image Sequence Using Optical Flow. In: Lee, R. (eds) Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing 2012. Studies in Computational Intelligence, vol 443. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32172-6_7

DOI: https://doi.org/10.1007/978-3-642-32172-6_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32171-9
Online ISBN: 978-3-642-32172-6
eBook Packages: Engineering, Engineering (R0)
