Content Structure Discovery in Educational Videos Using Shared Structures in the Hierarchical Hidden Markov Models

  • Dinh Q. Phung
  • Hung H. Bui
  • Svetha Venkatesh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3138)


In this paper, we present an application of the hierarchical hidden Markov model (HHMM) to structure discovery in educational videos. The HHMM has recently been extended to accommodate shared structures, i.e., a state may inherit from more than one parent. Utilising the expressiveness of this model, we concentrate on a specific class of video, educational videos, in which the hierarchy of semantic units is simpler and clearly defined in terms of topics and their sub-units. We model the hierarchy of topical structures with an HHMM and demonstrate the usefulness of the model in detecting topic transitions.


Keywords: Hidden Markov Model, Semantic Concept, Educational Video, Broadcast News, Shared Structure
These keywords were added by machine and not by the authors.



Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Dinh Q. Phung (1)
  • Hung H. Bui (2)
  • Svetha Venkatesh (1)

  1. Department of Computing, Curtin University of Technology, Perth
  2. Artificial Intelligence Center, SRI International, Menlo Park, USA
