Improving Cluster Selection and Event Modeling in Unsupervised Mining for Automatic Audiovisual Video Structuring

  • Anh-Phuong Ta
  • Mathieu Ben
  • Guillaume Gravier
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7131)


Can audio-visually consistent events be discovered from videos in a totally unsupervised manner, and how can videos of different genres be mined? In this paper, we present new results on the automatic discovery of audio-visual events. We propose a new measure to select audio-visually consistent elements from two dendrograms, which represent the hierarchical clustering results for the audio and visual modalities, respectively. Each selected element corresponds to a candidate event. To build a model for each event, each candidate event is represented as a group of clusters, and a voting mechanism is applied to select training examples for discriminative classifiers. Finally, the trained model is applied to the entire video to select the video segments that belong to the discovered event. Experimental results on different and challenging video genres show the effectiveness of our approach.
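The abstract does not define the proposed selection measure. As a rough illustration only, the sketch below assumes that plain mutual information (listed among the keywords) is used to score how consistently an audio cluster and a visual cluster cover the same video segments. The function names, the representation of dendrogram nodes as sets of segment indices, and the exhaustive pairwise search are all hypothetical choices made for this example, not the authors' method.

```python
import math
from itertools import product


def mutual_information(a, b, n):
    """Mutual information (in bits) between the membership indicators of
    two clusters a and b, each a set of segment indices in range(n).
    Assumption: this stands in for the paper's (unspecified) measure."""
    mi = 0.0
    for in_a, in_b in product((True, False), repeat=2):
        # Joint probability that a segment is (not) in a and (not) in b.
        joint = sum(
            1 for s in range(n) if (s in a) == in_a and (s in b) == in_b
        ) / n
        # Marginal probabilities of the same two events.
        p_a = (len(a) if in_a else n - len(a)) / n
        p_b = (len(b) if in_b else n - len(b)) / n
        if joint > 0:
            mi += joint * math.log2(joint / (p_a * p_b))
    return mi


def most_consistent_pair(audio_nodes, visual_nodes, n):
    """Return the (audio, visual) pair of candidate clusters with the
    highest score; each node is a set of segment indices (hypothetical)."""
    return max(
        product(audio_nodes, visual_nodes),
        key=lambda pair: mutual_information(pair[0], pair[1], n),
    )


# Toy example with 8 segments and two candidate nodes per modality.
audio_nodes = [{0, 1, 2, 3}, {0, 1}]
visual_nodes = [{0, 1, 2, 3}, {2, 3, 4, 5}]
best_audio, best_visual = most_consistent_pair(audio_nodes, visual_nodes, 8)
```

Note that mutual information scores a cluster and its complement identically, so a practical measure would need an additional check that the two clusters actually overlap; how the paper handles this is not stated in the abstract.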


Video mining; Video structuring; Multimodality; Mutual Information; Event discovery; Structural event; Audiovisual consistency





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Anh-Phuong Ta (1)
  • Mathieu Ben (2)
  • Guillaume Gravier (3)
  1. INRIA-Rennes, Rennes Cedex, France
  2. Powedia, Rennes Cedex, France
  3. CNRS-IRISA, Rennes Cedex, France
