Improving Cluster Selection and Event Modeling in Unsupervised Mining for Automatic Audiovisual Video Structuring

  • Conference paper
Advances in Multimedia Modeling (MMM 2012)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7131)

Abstract

Can we discover audio-visually consistent events from videos in a totally unsupervised manner? And how can we mine videos of different genres? In this paper we present new results on automatically discovering audio-visual events. A new measure is proposed to select audio-visually consistent elements from the two dendrograms that represent the hierarchical clustering results for the audio and visual modalities, respectively. Each selected element corresponds to a candidate event. To construct a model for each event, each candidate event is represented as a group of clusters, and a voting mechanism is applied to select training examples for discriminative classifiers. Finally, the trained model is tested on the entire video to select the video segments that belong to the discovered event. Experimental results on different and challenging video genres show the effectiveness of our approach.
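The abstract does not define the proposed consistency measure or the voting rule, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each dendrogram node is a set of video segment indices, uses Jaccard overlap as a stand-in for the paper's measure, and uses a simple vote count as a stand-in for its voting mechanism. The names `jaccard`, `select_consistent_pairs`, and `vote_training_examples`, as well as the thresholds and toy data, are hypothetical.

```python
from itertools import product


def jaccard(a, b):
    """Overlap between two sets of segment indices.

    Assumption: a stand-in score, not the measure proposed in the paper.
    """
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_consistent_pairs(audio_nodes, visual_nodes, threshold=0.5):
    """Return (audio_id, visual_id, score) for node pairs whose score
    reaches the threshold, sorted with the best-scoring pair first."""
    pairs = []
    for (a_id, a_segs), (v_id, v_segs) in product(
        audio_nodes.items(), visual_nodes.items()
    ):
        score = jaccard(a_segs, v_segs)
        if score >= threshold:
            pairs.append((a_id, v_id, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


def vote_training_examples(clusters, min_votes):
    """Keep the segments that appear in at least min_votes clusters.

    Assumption: a simple counting vote, chosen only for illustration.
    """
    votes = {}
    for segs in clusters:
        for s in segs:
            votes[s] = votes.get(s, 0) + 1
    return sorted(s for s, n in votes.items() if n >= min_votes)


# Toy dendrogram nodes: node id -> set of segment indices (hypothetical data).
audio_nodes = {"a1": {0, 1, 2, 3}, "a2": {4, 5}, "a3": {0, 1, 2, 3, 4, 5}}
visual_nodes = {"v1": {0, 1, 2}, "v2": {4, 5, 6}, "v3": {7, 8}}

# Each selected pair plays the role of one candidate event.
pairs = select_consistent_pairs(audio_nodes, visual_nodes, threshold=0.6)

# Segments supported by both clusters of the first candidate event.
positives = vote_training_examples(
    [audio_nodes[pairs[0][0]], visual_nodes[pairs[0][1]]], min_votes=2
)
```

In this sketch, the segments returned by `vote_training_examples` would serve as positive training examples for a discriminative classifier, which would then be applied to every segment of the video. The classifier itself is omitted because the abstract does not specify it.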

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ta, AP., Ben, M., Gravier, G. (2012). Improving Cluster Selection and Event Modeling in Unsupervised Mining for Automatic Audiovisual Video Structuring. In: Schoeffmann, K., Merialdo, B., Hauptmann, A.G., Ngo, CW., Andreopoulos, Y., Breiteneder, C. (eds) Advances in Multimedia Modeling. MMM 2012. Lecture Notes in Computer Science, vol 7131. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27355-1_49

  • DOI: https://doi.org/10.1007/978-3-642-27355-1_49

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-27354-4

  • Online ISBN: 978-3-642-27355-1

  • eBook Packages: Computer Science, Computer Science (R0)
