Multimedia Systems, Volume 23, Issue 1, pp 5–18

A discriminative graph inferring framework towards weakly supervised image parsing

  • Lei Yu
  • Bing-Kun Bao
  • Changsheng Xu
Special Issue Paper


In this paper, we focus on the task of assigning labels to over-segmented image patches in a weakly supervised manner, in which the training images carry labels but not the locations of those labels within the images. We propose a unified discriminative graph inferring framework that simultaneously infers patch labels and learns patch appearance models. On the one hand, graph inferring determines the patch labels through a graph propagation procedure. The graph is constructed by connecting nearest neighbors that share the same image label, and multiple correlations among patches and image labels are imposed as constraints on the inference. On the other hand, for each label, patches that do not contain the target label are adopted as negative samples to learn the appearance model, which makes the labels predicted during propagation more accurate. Graph inferring and the learned patch appearance models are finally embedded in one unified formulation so that they complement each other. Experiments on three public datasets demonstrate the effectiveness of our method in comparison with other baselines.
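The graph propagation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: it uses a generic label propagation update on a k-nearest-neighbor graph, and it leaves out the discriminative appearance models entirely. The function names `build_graph` and `propagate`, the Gaussian affinity, the parameters `k`, `alpha`, and `iters`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def build_graph(features, patch_image, image_labels, k=1):
    """Hypothetical helper: kNN affinity graph in which two patches may be
    linked only if their source images share at least one image-level label."""
    n = len(features)
    L = image_labels[patch_image].astype(float)   # (n_patches, n_labels)
    share = (L @ L.T) > 0                         # images have a common label
    np.fill_diagonal(share, False)                # no self-links
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    d2 = np.where(share, d2, np.inf)              # forbid disallowed pairs
    W = np.zeros((n, n))
    for i in range(n):
        nn = np.argsort(d2[i])[:k]
        nn = nn[np.isfinite(d2[i, nn])]
        W[i, nn] = np.exp(-d2[i, nn])             # assumed Gaussian affinity
    return np.maximum(W, W.T)                     # symmetrize

def propagate(W, allowed, alpha=0.8, iters=50):
    """Generic label propagation; a patch may only take labels carried by
    its own image (the `allowed` mask). Assumes every image has a label."""
    Y0 = allowed / allowed.sum(1, keepdims=True)  # uniform over allowed labels
    S = W / np.maximum(W.sum(1, keepdims=True), 1e-12)
    F = Y0.copy()
    for _ in range(iters):
        F = alpha * (S @ F) + (1 - alpha) * Y0    # spread scores to neighbors
        F = F * allowed                           # image-label constraint
        F = F / np.maximum(F.sum(1, keepdims=True), 1e-12)
    return F.argmax(1), F

# Toy data (assumed): labels 0 = "sky", 1 = "grass".
# Image 0 is tagged {sky}, image 1 {sky, grass}, image 2 {grass}.
image_labels = np.array([[1, 0], [1, 1], [0, 1]])
patch_image = np.array([0, 1, 1, 2])              # source image of each patch
features = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.9], [1.0, 1.0]])

W = build_graph(features, patch_image, image_labels, k=1)
allowed = image_labels[patch_image].astype(float)
labels, F = propagate(W, allowed)
print(labels)
```

In this toy setup the two patches of the ambiguously tagged image 1 receive their labels from the unambiguous patches they resemble, which is the intuition behind propagating over a label-constrained neighbor graph.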


Image annotation · Appearance model · Label propagation · Label localization · Image parsing



This work is supported by the 973 Program (2012CB316304), the National Natural Science Foundation of China (61201374, 61225009, 61432019), and the Beijing Natural Science Foundation (4131004, 4152053).


Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. Institute of Automation, Chinese Academy of Science, Beijing, China
