Hierarchical Hidden Markov Model for Rushes Structuring and Indexing

  • Chong-Wah Ngo
  • Zailiang Pan
  • Xiaoyong Wei
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4071)


Rushes footage is regarded as a cheap gold mine with potential for reuse in the broadcasting and filmmaking industries. However, mining the “gold” from rushes is difficult because usually only minimal metadata is available. This paper focuses on structuring and indexing rushes to facilitate the mining and retrieval of this “gold”. We present a new approach to rushes structuring and indexing based on motion features. We model the problem with a two-level Hierarchical Hidden Markov Model (HHMM). The higher level of the HHMM represents the semantic concepts, providing simultaneous structuring and indexing, while the lower level models the motion feature distributions that support the encoding of those concepts. Encouraging experimental results on the TRECVID'05 BBC rushes demonstrate the effectiveness of our approach.
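To make the two-level idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes two hypothetical concepts (`static`, `moving`), a scalar motion-magnitude feature, and hand-picked toy parameters, none of which come from the paper. It also flattens the hierarchy into an ordinary HMM over (concept, sub-state) pairs and decodes with Viterbi, which is one common way to run inference on a small HHMM and may differ from the procedure used in the paper.

```python
import math

# Hypothetical top-level concepts and toy parameters (assumptions, not from the paper).
CONCEPTS = ["static", "moving"]
TOP_INIT = {"static": 0.5, "moving": 0.5}
TOP_TRANS = {"static": {"moving": 1.0}, "moving": {"static": 1.0}}

# Each concept owns a 2-state sub-HMM. "emit" holds (mean, std) of a 1-D Gaussian
# over an assumed scalar motion magnitude; "exit" is the chance of leaving the concept.
SUB = {
    "static": {"init": [0.5, 0.5], "trans": [[0.8, 0.2], [0.2, 0.8]],
               "exit": [0.05, 0.05], "emit": [(0.0, 0.5), (1.0, 0.5)]},
    "moving": {"init": [0.5, 0.5], "trans": [[0.8, 0.2], [0.2, 0.8]],
               "exit": [0.05, 0.05], "emit": [(5.0, 1.0), (8.0, 1.0)]},
}


def safe_log(p):
    """Log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def log_gauss(x, mean, std):
    """Log density of a 1-D Gaussian."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def flat_states():
    """Flatten the hierarchy into (concept, sub-state index) pairs."""
    return [(c, k) for c in CONCEPTS for k in range(len(SUB[c]["init"]))]


def trans_prob(src, dst):
    """Flattened transition: stay inside the sub-HMM, or exit and enter a new concept."""
    (c, k), (c2, k2) = src, dst
    exit_p = SUB[c]["exit"][k]
    via_top = exit_p * TOP_TRANS[c].get(c2, 0.0) * SUB[c2]["init"][k2]
    if c == c2:
        return (1.0 - exit_p) * SUB[c]["trans"][k][k2] + via_top
    return via_top


def log_emit(state, x):
    c, k = state
    return log_gauss(x, *SUB[c]["emit"][k])


def viterbi_concepts(obs):
    """Return the most likely top-level concept label for each observation."""
    states = flat_states()
    score = {s: safe_log(TOP_INIT[s[0]] * SUB[s[0]]["init"][s[1]]) + log_emit(s, obs[0])
             for s in states}
    back = []
    for x in obs[1:]:
        new_score, ptr = {}, {}
        for d in states:
            best = max(states, key=lambda s: score[s] + safe_log(trans_prob(s, d)))
            new_score[d] = score[best] + safe_log(trans_prob(best, d)) + log_emit(d, x)
            ptr[d] = best
        score = new_score
        back.append(ptr)
    path = [max(states, key=lambda s: score[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return [c for c, _ in path]
```

In this sketch, the decoded concept sequence serves both purposes named in the abstract: runs of identical labels give the temporal structure, and the labels themselves act as index terms. Calling `viterbi_concepts` on a sequence of per-frame motion magnitudes returns one concept label per frame.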


Support Vector Machine; Motion Feature; Finite State Machine; Semantic Concept; Observation Sequence
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Chong-Wah Ngo (1)
  • Zailiang Pan (1)
  • Xiaoyong Wei (1)
  1. Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
