Multi-modal Correlation Modeling and Ranking for Retrieval

  • Hong Zhang
  • Fanlian Meng
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5879)


Correlation measures have recently attracted attention in multimedia retrieval as an alternative to distance metrics such as the Euclidean and Mahalanobis distances. However, most correlation learning algorithms focus on multimedia data of a single modality; for heterogeneous data spanning several modalities, correlation learning is more complicated. In this paper, we analyze multi-modal correlation among text, image and audio to understand the underlying semantics for multi-modal retrieval. First, Kernel Canonical Correlation Analysis (Kernel CCA) is used to build a kernel space in which global inter-media correlation is analyzed. Second, based on the local geometrical topology in the kernel space, a weighted graph and its corresponding affinity matrix are formed to represent the data and their correlations. Third, correlation ranking is used to generate retrieval results. We also provide active learning strategies in relevance feedback to improve retrieval results. Experiments and comparisons are encouraging and show that our approach is effective.
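The graph-and-ranking stage described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption: the points are taken to be already projected into a shared space (the Kernel CCA step is not shown), the affinity matrix uses a Gaussian weight on a k-nearest-neighbor graph, and the ranking uses the standard closed-form score propagation f = (I - alpha S)^(-1) y from the manifold-ranking literature. The function names and the parameters `sigma`, `k` and `alpha` are hypothetical and chosen only for illustration.

```python
import numpy as np


def affinity_matrix(points, sigma=1.0, k=2):
    """Gaussian-weighted, symmetric k-nearest-neighbor affinity matrix (assumed form)."""
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    weights = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)  # no self-loops

    # Keep each point's k strongest edges, then symmetrize the edge set.
    n = len(points)
    nearest = np.argsort(-weights, axis=1)[:, :k]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), k), nearest.ravel()] = True
    mask |= mask.T
    return np.where(mask, weights, 0.0)


def rank_by_propagation(weights, query_index, alpha=0.9):
    """Propagate a query's score over the graph: f = (I - alpha * S)^(-1) y."""
    degree = np.maximum(weights.sum(axis=1), 1e-12)  # guard isolated nodes
    inv_sqrt = 1.0 / np.sqrt(degree)
    s = weights * inv_sqrt[:, None] * inv_sqrt[None, :]  # D^(-1/2) W D^(-1/2)

    y = np.zeros(len(weights))
    y[query_index] = 1.0  # the query is the only initially labeled node
    return np.linalg.solve(np.eye(len(weights)) - alpha * s, y)


# Toy data: two well-separated groups standing in for embedded media objects.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                   [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]])
weights = affinity_matrix(points, sigma=1.0, k=2)
scores = rank_by_propagation(weights, query_index=0)
ranking = np.argsort(-scores)  # indices ordered from most to least relevant
```

In this sketch, relevance feedback could be modeled by setting further entries of `y` to positive or negative values for items the user marks as relevant or irrelevant and solving again; how the paper's active learning strategy chooses which items to show the user is not stated in the abstract.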


Keywords: Multi-modal, Kernel CCA, Correlation Ranking, Active Learning





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Hong Zhang
  • Fanlian Meng

College of Computer Science & Technology, Wuhan University of Science & Technology, China
