An Improvement of Basic Mouth Shape Detection Rate from Japanese Utterance Image Sequence Using Optical Flow

  • Conference paper
Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing 2012

Part of the book series: Studies in Computational Intelligence ((SCI,volume 443))

Abstract

In this paper, we describe an improvement to a method that detects distinctive mouth shapes in Japanese utterance image sequences. We previously proposed a method that detects these mouth shapes by template matching. Two kinds of mouth shapes are formed when a Japanese phone is pronounced: one at the beginning of the utterance and one at the end. The former is called the “Beginning Mouth Shape” (BeMS) and the latter the “End Mouth Shape” (EMS). The earlier method was able to detect these mouth shapes, but it misdetected them in some cases because the period during which the BeMS is formed is short. We therefore expected that a high-speed camera would be able to capture the BeMS. Experiments showed that it could, but another problem arose: a transitional mouth shape, caught while changing from one shape to another, was detected as a BeMS. To prevent such false detections, we adopt optical flow. Optical flow is used to detect the periods in which the mouth is deforming, and mouth shapes within those periods are excluded from detection. We propose a method that detects BeMS and EMS in Japanese utterance image sequences by using template matching and optical flow.
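The gating idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that each frame already has a set of template-matching scores (one per mouth-shape label) and a mean optical-flow magnitude over the mouth region, both computed elsewhere. The function name `detect_mouth_shapes`, the labels, and both threshold values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Sequence


def detect_mouth_shapes(
    match_scores: Sequence[Dict[str, float]],
    flow_magnitudes: Sequence[float],
    motion_threshold: float = 0.5,  # hypothetical value, not from the paper
    score_threshold: float = 0.8,   # hypothetical value, not from the paper
) -> List[Optional[str]]:
    """Return one mouth-shape label per frame, or None when nothing is detected.

    match_scores[i] maps a template label to its matching score for frame i.
    flow_magnitudes[i] is the mean optical-flow magnitude for frame i.
    """
    if len(match_scores) != len(flow_magnitudes):
        raise ValueError("one flow magnitude is required per frame")

    detections: List[Optional[str]] = []
    for scores, magnitude in zip(match_scores, flow_magnitudes):
        # A large flow magnitude means the mouth is still deforming,
        # so the frame is excluded from detection.
        if magnitude > motion_threshold or not scores:
            detections.append(None)
            continue
        # Otherwise accept the best-matching template if it is good enough.
        label = max(scores, key=scores.get)
        detections.append(label if scores[label] >= score_threshold else None)
    return detections


if __name__ == "__main__":
    scores = [{"a": 0.9, "i": 0.2}, {"a": 0.85, "u": 0.83}, {"u": 0.95}, {"u": 0.6}]
    flow = [0.1, 1.2, 0.2, 0.1]
    print(detect_mouth_shapes(scores, flow))
```

In this example the second frame matches a template well but is rejected because its flow magnitude exceeds the motion threshold, which is the kind of transitional frame the abstract describes as being misdetected as a BeMS.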

Author information

Corresponding author

Correspondence to Tsuyoshi Miyazaki.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Miyazaki, T., Nakashima, T., Ishii, N. (2013). An Improvement of Basic Mouth Shape Detection Rate from Japanese Utterance Image Sequence Using Optical Flow. In: Lee, R. (ed.) Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing 2012. Studies in Computational Intelligence, vol 443. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32172-6_7

  • DOI: https://doi.org/10.1007/978-3-642-32172-6_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32171-9

  • Online ISBN: 978-3-642-32172-6

  • eBook Packages: Engineering, Engineering (R0)
