Explicit Performance Metric Optimization for Fusion-Based Video Retrieval

  • Ilseo Kim
  • Sangmin Oh
  • Byungki Byun
  • A. G. Amitha Perera
  • Chin-Hui Lee
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7585)


We present a learning framework for fusion-based video retrieval systems that explicitly optimizes given performance metrics. Real-world computer vision systems serve sophisticated user needs, and domain-specific performance metrics are used to monitor the success of such systems. However, the conventional approach to learning under such circumstances is to blindly minimize standard error rates and hope that the targeted performance metrics improve, which is clearly suboptimal. In this work, we develop a novel scheme that directly optimizes the targeted performance metrics during learning. Our experimental results on two large consumer video archives are promising and showcase the benefits of the proposed approach.


Keywords (added by machine, not by the authors): Performance Metrics; Video Retrieval; Audio Feature; Multiple Kernel Learning; Multimedia Event Detection



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Ilseo Kim (1)
  • Sangmin Oh (2)
  • Byungki Byun (3)
  • A. G. Amitha Perera (2)
  • Chin-Hui Lee (1)
  1. Georgia Institute of Technology, USA
  2. Kitware Inc., USA
  3. Microsoft, USA
