Video Affective Content Representation and Recognition Using Video Affective Tree and Hidden Markov Models

  • Conference paper
Affective Computing and Intelligent Interaction (ACII 2007)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4738)

Abstract

This paper presents a framework for representing and recognizing video affective content, based on the Video Affective Tree (VAT) and Hidden Markov Models (HMMs). Video affective content units at different granularities are first located using excitement intensity curves, and the selected units are then used to construct the VAT. The excitement intensity curve is also used to quantify the affective intensity of each unit, at each level of the VAT, into several levels from weak to strong. Many middle-level audio and visual affective features, which represent emotional characteristics, are designed and extracted to construct observation vectors. HMM-based video affective content recognizers are trained and tested on these observation vector sequences to recognize four basic emotional events of the audience: joy, anger, sadness and fear. The experimental results show that the proposed framework is suitable for a broad range of video affective understanding applications and can represent affective semantics at different granularities.
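The recognition step described in the abstract can be illustrated with a minimal sketch. It assumes one HMM per emotion and picks the emotion whose model assigns the highest likelihood to an observation sequence, computed with the scaled forward algorithm. Everything concrete here is an assumption made for illustration and is not taken from the paper: the observation vectors are reduced to three discrete symbols (0 = low, 1 = medium, 2 = high), each model has two hidden states, and the parameters are hand-picked rather than trained. The names `log_likelihood`, `recognize` and `MODELS` are hypothetical.

```python
import numpy as np


def log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete-output HMM (scaled forward pass).

    obs   : sequence of integer symbols
    start : (n_states,) initial state probabilities
    trans : (n_states, n_states) transition matrix, rows sum to 1
    emit  : (n_states, n_symbols) emission matrix, rows sum to 1
    """
    alpha = start * emit[:, obs[0]]
    scale = alpha.sum()
    log_p = np.log(scale)
    alpha = alpha / scale
    for symbol in obs[1:]:
        # Propagate one step, weight by the emission, then rescale
        # so that long sequences do not underflow.
        alpha = (alpha @ trans) * emit[:, symbol]
        scale = alpha.sum()
        log_p += np.log(scale)
        alpha = alpha / scale
    return log_p


def recognize(obs, models):
    """Score obs under every emotion model and return (best_label, scores)."""
    scores = {
        label: log_likelihood(obs, *params) for label, params in models.items()
    }
    return max(scores, key=scores.get), scores


# Hand-picked toy parameters (assumption: NOT trained, NOT from the paper).
START = np.array([0.5, 0.5])
TRANS = np.array([[0.9, 0.1], [0.1, 0.9]])
MODELS = {
    "joy": (START, TRANS, np.array([[0.1, 0.2, 0.7], [0.2, 0.3, 0.5]])),
    "anger": (START, TRANS, np.array([[0.1, 0.7, 0.2], [0.2, 0.5, 0.3]])),
    "sadness": (START, TRANS, np.array([[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]])),
    "fear": (START, TRANS, np.array([[0.34, 0.33, 0.33], [0.3, 0.4, 0.3]])),
}

if __name__ == "__main__":
    sequence = [2, 2, 2, 1, 2, 2]  # mostly "high" symbols
    label, scores = recognize(sequence, MODELS)
    print(label)
    for name, value in scores.items():
        print(f"{name}: {value:.3f}")
```

In a realistic setting the emission model would more likely be continuous (for example Gaussian mixtures over the audio and visual feature vectors) and the parameters would be estimated from labelled training sequences, for instance with Baum-Welch; the maximum-likelihood decision rule would stay the same.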

Editor information

Ana C. R. Paiva Rui Prada Rosalind W. Picard

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sun, K., Yu, J. (2007). Video Affective Content Representation and Recognition Using Video Affective Tree and Hidden Markov Models. In: Paiva, A.C.R., Prada, R., Picard, R.W. (eds) Affective Computing and Intelligent Interaction. ACII 2007. Lecture Notes in Computer Science, vol 4738. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74889-2_52

  • DOI: https://doi.org/10.1007/978-3-540-74889-2_52

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74888-5

  • Online ISBN: 978-3-540-74889-2

  • eBook Packages: Computer Science, Computer Science (R0)
