Abstract
Indexing of visual media based on content analysis has moved beyond individual concept detectors; the focus is now on combining concepts or post-processing the outputs of individual concept detectors. Because available training corpora are limited and usually sparsely and imprecisely labeled, training-based refinement methods for semantic indexing of visual media struggle to capture relationships between concepts correctly, including co-occurrence and ontological relationships. In contrast to the training-dependent methods that dominate this field, this paper presents a training-free refinement (TFR) algorithm that enhances semantic indexing of visual media based purely on concept detection results, which makes refinement of the initial concept detections practical and flexible. This is achieved using global and temporal neighbourhood information inferred from the original concept detections, by means of weighted non-negative matrix factorization and neighbourhood-based graph propagation, respectively. Any available ontological concept relationships can also be integrated into the model as an additional source of external a priori knowledge. Experiments on two datasets demonstrate the efficacy of the proposed TFR solution.
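To make the factorization step more concrete, the following is a minimal sketch of a generic weighted non-negative matrix factorization applied to a matrix of concept detection confidences. It is not the authors' implementation: the function names (weighted_nmf, weighted_error), the toy matrix C, the weighting rule for W, and the rank are all assumptions chosen for illustration, and the updates are the standard multiplicative rules adapted to a weighted squared error. The temporal graph propagation step is not shown.

```python
import numpy as np

def weighted_error(C, W, U, V):
    """Weighted squared reconstruction error: sum of W * (C - U V)^2."""
    return float(np.sum(W * (C - U @ V) ** 2))

def weighted_nmf(C, W, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate C by U @ V with U, V >= 0, weighting each entry by W.

    Uses multiplicative updates, which keep the factors non-negative
    as long as C, W and the initial factors are non-negative.
    """
    rng = np.random.default_rng(seed)
    n, m = C.shape
    U = rng.random((n, rank)) + eps
    V = rng.random((rank, m)) + eps
    WC = W * C
    for _ in range(n_iter):
        U *= (WC @ V.T) / ((W * (U @ V)) @ V.T + eps)
        V *= (U.T @ WC) / (U.T @ (W * (U @ V)) + eps)
    return U, V

# Toy detection matrix (made-up values): 4 shots (rows) x 3 concepts (columns).
C = np.array([[0.9, 0.8, 0.1],
              [0.8, 0.2, 0.1],
              [0.1, 0.1, 0.9],
              [0.2, 0.1, 0.8]])

# Hypothetical weighting: trust confident detections more than weak ones.
W = np.where(C >= 0.5, 1.0, 0.2)

U, V = weighted_nmf(C, W, rank=2)
refined = U @ V  # low-rank reconstruction used as refined confidences
```

In this sketch the low-rank product fills in weakly weighted entries from patterns shared across shots and concepts, which is one plausible reading of how global information could refine the initial detections without any training data.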
P. Wang: This work was part-funded by the 973 Program under Grant No. 2011CB302206, the National Natural Science Foundation of China under Grant Nos. 61272231, 61472204 and 61502264, the Beijing Key Laboratory of Networked Multimedia, and Science Foundation Ireland under grant SFI/12/RC/2289. We also thank Prof. Philip S. Yu for helpful discussions.
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, P., Sun, L., Yang, S., Smeaton, A.F. (2016). Towards Training-Free Refinement for Semantic Indexing of Visual Media. In: Tian, Q., Sebe, N., Qi, GJ., Huet, B., Hong, R., Liu, X. (eds) MultiMedia Modeling. MMM 2016. Lecture Notes in Computer Science, vol 9516. Springer, Cham. https://doi.org/10.1007/978-3-319-27671-7_21
DOI: https://doi.org/10.1007/978-3-319-27671-7_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27670-0
Online ISBN: 978-3-319-27671-7
eBook Packages: Computer Science, Computer Science (R0)