Inductive hierarchical nonnegative graph embedding for “verb–object” image classification

  • Special Issue Paper
Machine Vision and Applications

Abstract

Most existing image classification algorithms focus on images that contain only “object” concepts. In real-world cases, however, a great variety of images contain “verb–object” concepts rather than only “object” ones. The hierarchical structure embedded in these “verb–object” concepts can help to enhance classification, but traditional feature representation methods cannot exploit it. To tackle this problem, we present a novel approach called inductive hierarchical nonnegative graph embedding. Assuming that “verb–object” concept images that share the same “object” part but differ in the “verb” part have a specific hierarchical structure, we integrate this structure into the nonnegative graph embedding technique, together with the definition of an inductive matrix, to (1) extract features effectively from the hierarchical structure, (2) easily map each new test sample to its low-dimensional nonnegative representation, and (3) classify “verb–object” concept images. Extensive experiments against state-of-the-art nonnegative data factorization algorithms demonstrate the classification power of the proposed approach on “verb–object” concept images.

Notes

  1. Superscript numbers on matrices (1, 2, 11, 12, etc.) are labels, not exponents.

  2. http://images.google.com.

  3. http://www.flickr.com/.

References

  1. Belhumeur, P., Hespanha, J.: Eigenfaces vs. fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 19(7), 711–720 (1997)

  2. Carneiro, G., Chan, A., Moreno, P., Vasconcelos, N.: Supervised learning of semantic classes for image annotation and retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 29(3), 394–410 (2007)

  3. Ding, C.H., Li, T., Jordan, M.I.: Convex and semi-nonnegative matrix factorizations. IEEE Trans. Pattern Anal. Mach. Intell. 32(1), 45–55 (2010)

  4. Gao, Y., Fan, J., Xue, X., Jain, R.: Automatic image annotation by incorporating feature hierarchy and boosting to scale up SVM classifiers. In: Proceedings of the 14th Annual ACM International Conference on Multimedia, pp. 901–910. ACM, New York (2006)

  5. Heger, A., Holm, L.: Sensitive pattern discovery with fuzzy alignments of distantly related proteins. Bioinformatics 19(suppl 1), i130–i137 (2003)

  6. Hong, R., Tang, J., Tan, H.-K., Ngo, C.-W., Yan, S., Chua, T.-S.: Beyond search: event-driven summarization for web videos. TOMCCAP 7(4), 35 (2011)

  7. Hong, R., Wang, M., Li, G., Nie, L., Zha, Z.-J., Chua, T.-S.: Multimedia question answering. IEEE Multimed. 19(4), 72–78 (2012)

  8. Hoyer, P.O.: Non-negative matrix factorization with sparseness constraints. J. Mach. Learn. Res. 5, 1457–1469 (2004)

  9. Hu, C., Zhang, B., Yan, S., Yang, Q., Yan, J., Chen, Z., Ma, W.: Mining ratio rules via principal sparse non-negative matrix factorization. In: Fourth IEEE International Conference on Data Mining, 2004. ICDM’04, pp. 407–410. IEEE (2004)

  10. Kim, P., Tidor, B.: Subsystem identification through dimensionality reduction of large-scale gene expression data. Genome Res. 13(7), 1706–1718 (2003)

  11. Kuhn, H.W., Tucker, A.W.: Nonlinear programming. In: Second Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 481–492 (1951)

  12. Lee, D., Seung, H., et al.: Learning the parts of objects by non-negative matrix factorization. Nature 401(6755), 788–791 (1999)

  13. Li, L., Jiang, S., Huang, Q.: Learning hierarchical semantic description via mixed-norm regularization for image understanding. IEEE Trans. Multimed. 14(5), 1401–1413 (2012)

  14. Li, L.-J., Su, H., Fei-Fei, L., Xing, E.P.: Object bank: a high-level image representation for scene classification & semantic feature sparsification. In: Advances in Neural Information Processing Systems, pp. 1378–1386 (2010)

  15. Li, S.Z., Hou, X.W., Zhang, H.J., Cheng, Q.S.: Learning spatially localized, parts-based representation. In: Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001. CVPR 2001, vol. 1, pp. I-207. IEEE (2001)

  16. Liu, X., Yan, S., Jin, H.: Projective nonnegative graph embedding. IEEE Trans. Image Process. 19(5), 1126–1137 (2010)

  17. Ramanath, R., Kuehni, R., Snyder, W., Hinks, D.: Spectral spaces and color spaces. Color Res. Appl. 29(1), 29–37 (2004)

  18. Ramanath, R., Snyder, W., Qi, H.: Eigenviews for object recognition in multispectral imaging systems. In: Applied Imagery Pattern Recognition Workshop, 2003. Proceedings. 32nd, pp. 33–38. IEEE (2003)

  19. Sun, C., Bao, B.-K., Xu, C.: Verb-object concepts image classification via hierarchical nonnegative graph embedding. In: Proceedings of the 19th International Conference on Multimedia Modeling (MMM), pp. 58–69 (2013)

  20. Wang, C., Song, Z., Yan, S., Zhang, L., Zhang, H.: Multiplicative nonnegative graph embedding. In: IEEE Conference on Computer Vision and Pattern Recognition, 2009. CVPR 2009, pp. 389–396. IEEE (2009)

  21. Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., Gong, Y.: Locality-constrained linear coding for image classification. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, pp. 3360–3367. IEEE (2010)

  22. Wang, M., Hong, R., Li, G., Zha, Z.-J., Yan, S., Chua, T.-S.: Event driven web video summarization by tag localization and key-shot identification. IEEE Trans. Multimed. 14(4), 975–985 (2012)

  23. Wang, Y., Jia, Y.: Fisher non-negative matrix factorization for learning local features. In: Proc. Asian Conf. on Comp. Vision, Citeseer (2004)

  24. Yan, S., Xu, D., Zhang, B., Zhang, H., Yang, Q., Lin, S.: Graph embedding and extensions: a general framework for dimensionality reduction. IEEE Trans. Pattern Anal. Mach. Intell. 29(1), 40–51 (2007)

  25. Yang, J., Yang, S., Fu, Y., Li, X., Huang, T.: Non-negative graph embedding. In: IEEE Conference on Computer Vision and Pattern Recognition, 2008. CVPR 2008, pp. 1–8. IEEE (2008)

  26. Yao, B., Jiang, X., Khosla, A., Lin, A., Guibas, L., Fei-Fei, L.: Human action recognition by learning bases of action attributes and parts. In: IEEE International Conference on Computer Vision (ICCV), 2011, pp. 1331–1338. IEEE (2011)

  27. Yun, X.: Non-negative matrix factorization for face recognition. PhD thesis, Hong Kong Baptist University (2007)

  28. Zhang, X., Zha, Z., Xu, C.: Learning verb-object concepts for semantic image annotation. In: Proceedings of the 19th ACM International Conference on Multimedia, pp. 1077–1080. ACM, New York (2011)

Acknowledgments

This work is supported in part by the National Basic Research Program of China (No. 2012CB316304), the National Natural Science Foundation of China (No. 61225009, No. 61201374) and the Beijing Natural Science Foundation (No. 4131004). This work is also supported by the Singapore National Research Foundation under its International Research Centre@Singapore Funding Initiative and administered by the IDM Programme Office.

Author information

Corresponding author

Correspondence to Changsheng Xu.

Appendix

Here we present the convergence proofs of the update rules for the matrices \(W\) and \(C\).

1.1 Preliminaries

We first introduce the concept of an auxiliary function and a lemma that will be used in the derivation.

Definition 1

Function \(G(A,A')\) is an auxiliary function for function \(F(A)\) if the following conditions are satisfied

$$\begin{aligned} G(A,A') \ge F(A), \qquad G(A,A) = F(A) \end{aligned}$$
(44)

From this definition, we have the following lemma; see [12] for the proof.

Lemma 1

If \(G\) is an auxiliary function, then \(F\) is non-increasing under the update

$$\begin{aligned} A^{t+1} = \mathrm{arg\, min}_A G(A,A^t) \end{aligned}$$
(45)

where \(t\) denotes the \(t\)th iteration.
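The omitted argument is short. As a sketch, reading the second argument of \(G\) in the update as the current iterate \(A^t\), the two conditions in (44) chain together as

$$\begin{aligned} F(A^{t+1}) \le G(A^{t+1}, A^t) \le G(A^t, A^t) = F(A^t) \end{aligned}$$

The first inequality is the upper-bound condition in (44), the second holds because \(A^{t+1}\) minimizes \(G(\cdot , A^t)\), and the final equality is the condition \(G(A,A) = F(A)\).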

1.2 Convergence proof of update rule for W

Let \(F_{ij}\) denote the part of \(F(W)\) relevant to \(W_{ij}\). Its first and second derivatives are

$$\begin{aligned} F'_{ij}(W)&= [W(2Q^1 +2Q^2 +2Q^3) +2\lambda W C X X^T C^T \nonumber \\&- 2\lambda X X^T C^T]_{ij}\end{aligned}$$
(46)
$$\begin{aligned} F''_{ij}(W)&= [2(Q^1 +Q^2 +Q^3) +2\lambda C X X^T C^T]_{jj} \end{aligned}$$
(47)

The auxiliary function of \(F_{ij}\) is then designed as

$$\begin{aligned} G(W_{ij}, W^t_{ij})&= F_{ij}(W^t_{ij}) + F'_{ij}(W^t_{ij})(W_{ij} - W^t_{ij}) \nonumber \\&\quad + \frac{[W^t (Q^1_+ + Q^2_+ + Q^3_+) + \lambda W^t C X X^T C^T]_{ij}}{W^t_{ij}} \nonumber \\&\quad \times (W_{ij} - W^t_{ij})^2 \end{aligned}$$
(48)

Lemma 2

Equation (48) is an auxiliary function for \(F_{ij}\), which is the part of \(F(W)\) relevant to \(W_{ij}\).

Proof

Obviously, \(G(W_{ij}, W_{ij}) = F_{ij}(W_{ij})\). We only need to prove that \(G(W_{ij}, W^t_{ij}) \ge F_{ij}(W_{ij})\).

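First, because the second derivative (47) does not depend on \(W_{ij}\), \(F_{ij}\) is quadratic in \(W_{ij}\) and its Taylor expansion about \(W^t_{ij}\) is exact: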

$$\begin{aligned} F_{ij}(W_{ij})&=F_{ij}(W^t_{ij}) + F'_{ij}(W^t_{ij})(W_{ij} - W^t_{ij}) \nonumber \\&\quad + \frac{1}{2} F''_{ij}(W^t_{ij})(W_{ij} - W^t_{ij})^2 \end{aligned}$$
(49)

Then, it is easy to verify that

$$\begin{aligned}&[\lambda W^t C X X^T C^T]_{ij} \ge W^t_{ij} [\lambda C X X^T C^T]_{jj}\end{aligned}$$
(50)
$$\begin{aligned}&[W^t (Q^1_+ +Q^2_+ +Q^3_+)]_{ij} \ge W^t_{ij} [(Q^1_+ +Q^2_+ +Q^3_+)]_{jj} \end{aligned}$$
(51)

Thus we have

$$\begin{aligned}&\frac{[W^t (Q^1_+ +Q^2_+ +Q^3_+) + \lambda W^t C X X^T C^T]_{ij}}{W^t_{ij}} \nonumber \\&\quad \ge [(Q^1 +Q^2 +Q^3) +\lambda C X X^T C^T]_{jj} \end{aligned}$$
(52)

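By (47) and (52), the coefficient of \((W_{ij} - W^t_{ij})^2\) in (48) is at least \(\frac{1}{2} F''_{ij}\), the corresponding coefficient in the Taylor expansion (49). Hence \(G(W_{ij}, W^t_{ij}) \ge F_{ij}(W_{ij})\) holds.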

Lemma 3

Equation (34) can be obtained by minimizing the auxiliary function \(G(W_{ij}, W^t_{ij})\).

Proof

Setting \(\partial G(W_{ij}, W^t_{ij}) / \partial W_{ij} = 0\), we have

$$\begin{aligned}&F'_{ij}(W^t_{ij}) + 2 \frac{[W^t (Q^1_+ + Q^2_+ + Q^3_+) + \lambda W^t C X X^T C^T]_{ij}}{W^t_{ij}} \nonumber \\&\quad \times (W_{ij} - W^t_{ij}) = 0 \end{aligned}$$
(53)

Solving for \(W_{ij}\), we obtain the update rule for \(W\)

$$\begin{aligned} W^{t+1}_{ij} \leftarrow W^t_{ij} \frac{[\lambda X X^T C^T + W^t(Q^1_- + Q^2_- + Q^3_-)]_{ij}}{[\lambda W^t C X X^T C^T + W^t(Q^1_+ +Q^2_+ + Q^3_+)]_{ij}} \end{aligned}$$
(54)

and the lemma is proved.
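The monotone behaviour implied by Lemmas 1 to 3 can be checked numerically on a small instance. The sketch below is illustrative only and rests on several assumptions that are not stated in this appendix: the objective is taken to be \(\mathrm{tr}(W Q W^T) + \lambda \Vert X - W C X\Vert _F^2\) with \(Q = Q^1 + Q^2 + Q^3\) symmetric, which is one function whose gradient matches (46); \(Q\) is chosen entrywise nonnegative so that \(Q_- = 0\); the data are random; and the names `objective` and `update_W`, the dimensions, and the small constant `eps` added to the denominator are hypothetical.

```python
import numpy as np

def objective(W, C, X, Q, lam):
    """Assumed objective: tr(W Q W^T) + lam * ||X - W C X||_F^2."""
    residual = X - W @ C @ X
    return np.trace(W @ Q @ W.T) + lam * np.sum(residual ** 2)

def update_W(W, C, X, Q_pos, Q_neg, lam, eps=1e-12):
    """One multiplicative step in the form of Eq. (54)."""
    XXtCt = X @ X.T @ C.T                           # X X^T C^T
    numer = lam * XXtCt + W @ Q_neg
    denom = lam * W @ C @ XXtCt + W @ Q_pos + eps   # eps guards against 0/0
    return W * numer / denom

rng = np.random.default_rng(0)
m, n, d = 6, 20, 3                 # feature dim, samples, embedding dim (illustrative)
X = rng.random((m, n))             # nonnegative data
C = rng.random((d, m))             # nonnegative matrix C, held fixed here
W = rng.random((m, d)) + 0.1       # strictly positive start
M = rng.random((d, d))
Q = M @ M.T                        # symmetric and entrywise nonnegative
Q_pos, Q_neg = Q, np.zeros_like(Q) # so the negative part is zero
lam = 0.5

history = [objective(W, C, X, Q, lam)]
for _ in range(50):
    W = update_W(W, C, X, Q_pos, Q_neg, lam)
    history.append(objective(W, C, X, Q, lam))

print(history[0], history[-1])
```

Under these assumptions the recorded objective values should form a non-increasing sequence and \(W\) should stay nonnegative, since every factor in the update is nonnegative.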

1.3 Convergence proof of update rule for C

Let \(F_{ij}\) denote the part of \(F(C)\) relevant to \(C_{ij}\). Its first and second derivatives are

$$\begin{aligned} F'_{ij}(C)&= [2{R^1}^T R^1 C X L^u X^T + 2{R^2}^T R^2 C X \tilde{L} X^T \nonumber \\&\quad + 2{R^3}^T R^3 C X \tilde{L}^u X^T - 2 \lambda W^T X X^T \nonumber \\&\quad + 2 \lambda W^T W C X X^T]_{ij}\end{aligned}$$
(55)
$$\begin{aligned} F''_{ij}(C)&= 2[{R^1}^T R^1]_{ii} [X L^u X^T]_{jj} + 2[{R^2}^T R^2]_{ii} [X \tilde{L} X^T]_{jj} \nonumber \\&\quad + 2[{R^3}^T R^3]_{ii} [X \tilde{L}^u X^T]_{jj} + 2 \lambda [W^T W]_{ii} [X X^T]_{jj} \end{aligned}$$
(56)

The auxiliary function of \(F_{ij}\) is then designed as

$$\begin{aligned}&G(C_{ij}, C^t_{ij}) \nonumber \\&\quad = F_{ij}(C^t_{ij}) + F'_{ij}(C^t_{ij})(C_{ij} - C^t_{ij}) \nonumber \\&\quad \quad + [{R^1}^T R^1 C^t X D^u X^T + {R^2}^T R^2 C^t X \tilde{D} X^T \nonumber \\&\quad \quad + {R^3}^T R^3 C^t X \tilde{D}^u X^T + \lambda W^T W C^t X X^T]_{ij} / C^t_{ij} \nonumber \\&\quad \quad \times (C_{ij} - C^t_{ij})^2 \end{aligned}$$
(57)

Lemma 4

Equation (57) is an auxiliary function for \(F_{ij}\), which is the part of \(F(C)\) relevant to \(C_{ij}\).

Proof

Obviously, \(G(C_{ij}, C_{ij}) = F_{ij}(C_{ij})\). We only need to prove that \(G(C_{ij}, C^t_{ij}) \ge F_{ij}(C_{ij})\).

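First, because the second derivative (56) does not depend on \(C_{ij}\), \(F_{ij}\) is quadratic in \(C_{ij}\) and its Taylor expansion about \(C^t_{ij}\) is exact: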

$$\begin{aligned} F_{ij}(C_{ij})&= F_{ij}(C^t_{ij}) + F'_{ij}(C^t_{ij})(C_{ij} - C^t_{ij}) \nonumber \\&\quad + \frac{1}{2} F''_{ij}(C^t_{ij})(C_{ij} - C^t_{ij})^2 \end{aligned}$$
(58)

Then, it is easy to verify that

$$\begin{aligned}&[W^T W C^t X X^T]_{ij} \ge C^t_{ij} [W^T W]_{ii} [X X^T]_{jj}\end{aligned}$$
(59)
$$\begin{aligned}&[{R^1}^T R^1 C^t X D^u X^T]_{ij} \ge [{R^1}^T R^1]_{ii} C^t_{ij} [X L^u X^T]_{jj}\end{aligned}$$
(60)
$$\begin{aligned}&[{R^2}^T R^2 C^t X \tilde{D} X^T]_{ij} \ge [{R^2}^T R^2]_{ii} C^t_{ij} [X \tilde{L} X^T]_{jj}\end{aligned}$$
(61)
$$\begin{aligned}&[{R^3}^T R^3 C^t X \tilde{D}^u X^T]_{ij} \ge [{R^3}^T R^3]_{ii} C^t_{ij} [X \tilde{L}^u X^T]_{jj} \end{aligned}$$
(62)

Thus we have \(G(C_{ij}, C^t_{ij}) \ge F_{ij}(C_{ij})\).
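One way to see why \(D\) appears on the left of (60)–(62) while \(L\) appears on the right is sketched below. It assumes the usual graph-Laplacian splitting \(L = D - S\) with an entrywise nonnegative similarity matrix \(S\); this splitting is not stated in this appendix, although the \(S\) and \(D\) terms in (64) suggest it. The same reasoning would apply to the pairs \((\tilde{D}, \tilde{L})\) and \((\tilde{D}^u, \tilde{L}^u)\).

$$\begin{aligned} X D^u X^T&= X L^u X^T + X S^u X^T \quad \Rightarrow \quad [X D^u X^T]_{jj} \ge [X L^u X^T]_{jj} \nonumber \\ [{R^1}^T R^1 C^t X D^u X^T]_{ij}&= \sum _{k,l} [{R^1}^T R^1]_{ik} C^t_{kl} [X D^u X^T]_{lj} \ge [{R^1}^T R^1]_{ii} C^t_{ij} [X D^u X^T]_{jj} \end{aligned}$$

The first line uses \(X \ge 0\) and \(S^u \ge 0\). The second keeps only the \(k = i\), \(l = j\) term of the sum, which is a lower bound provided every factor is entrywise nonnegative. Combining the two lines gives (60).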

Lemma 5

Equation (43) can be obtained by minimizing the auxiliary function \(G(C_{ij}, C^t_{ij})\).

Proof

Setting \(\partial G(C_{ij}, C^t_{ij}) / \partial C_{ij} = 0\), we have

$$\begin{aligned}&F'_{ij}(C^t_{ij}) + 2 [{R^1}^T R^1 C^t X D^u X^T + {R^2}^T R^2 C^t X \tilde{D} X^T \nonumber \\&\quad + {R^3}^T R^3 C^t X \tilde{D}^u X^T + \lambda W^T W C^t X X^T]_{ij} / C^t_{ij} \nonumber \\&\quad \cdot (C_{ij} - C^t_{ij}) = 0 \end{aligned}$$
(63)

Solving for \(C_{ij}\), we obtain the update rule for \(C\)

$$\begin{aligned}&C^{t+1}_{ij} \leftarrow C^t_{ij} \cdot [\lambda W^T X X^T + {R^1}^T R^1 C^t X S^u X^T \nonumber \\&\quad + {R^2}^T R^2 C^t X \tilde{S} X^T + {R^3}^T R^3 C^t X \tilde{S}^u X^T]_{ij} \nonumber \\&\quad / \ [\lambda W^T W C^t X X^T + {R^1}^T R^1 C^t X D^u X^T \nonumber \\&\quad + {R^2}^T R^2 C^t X \tilde{D} X^T + {R^3}^T R^3 C^t X \tilde{D}^u X^T]_{ij} \end{aligned}$$
(64)

and the lemma is proved. \(\square \)
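The role of the \(S\) terms in the numerator and the \(D\) terms in the denominator of (64) can be illustrated on a reduced problem. The sketch below is not the full update: it assumes a single graph term with \(R = I\), an objective \(\mathrm{tr}(C X L X^T C^T) + \lambda \Vert X - W C X\Vert _F^2\) whose gradient has the form of (55), a Laplacian \(L = D - S\), and a same-class similarity \(S = Y Y^T\) built from hypothetical one-hot labels \(Y\), which makes \(S\) both nonnegative and positive semidefinite. The names `objective` and `update_C`, the random data, and `eps` are likewise assumptions.

```python
import numpy as np

def objective(C, W, X, L, lam):
    """Assumed objective: tr(C X L X^T C^T) + lam * ||X - W C X||_F^2."""
    CX = C @ X
    return np.trace(CX @ L @ CX.T) + lam * np.sum((X - W @ CX) ** 2)

def update_C(C, W, X, S, D, lam, eps=1e-12):
    """One multiplicative step in the form of Eq. (64), with R = I and one graph."""
    XXt = X @ X.T
    numer = lam * W.T @ XXt + C @ X @ S @ X.T            # S terms go on top
    denom = lam * W.T @ W @ C @ XXt + C @ X @ D @ X.T + eps  # D terms below
    return C * numer / denom

rng = np.random.default_rng(1)
m, n, d, n_classes = 6, 12, 3, 3   # illustrative sizes
X = rng.random((m, n))             # nonnegative data
W = rng.random((m, d))             # nonnegative basis, held fixed here
C = rng.random((d, m)) + 0.1       # strictly positive start
labels = np.arange(n) % n_classes  # hypothetical class labels
Y = np.eye(n_classes)[labels]      # one-hot indicator, shape (n, n_classes)
S = Y @ Y.T                        # 1 where two samples share a class
D = np.diag(S.sum(axis=1))         # degree matrix
L = D - S                          # graph Laplacian
lam = 0.5

history = [objective(C, W, X, L, lam)]
for _ in range(50):
    C = update_C(C, W, X, S, D, lam)
    history.append(objective(C, W, X, L, lam))

print(history[0], history[-1])
```

With these choices the objective values should be non-increasing and \(C\) should remain nonnegative. For a similarity matrix that is not positive semidefinite, this simple check is not guaranteed to pass.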

About this article

Cite this article

Sun, C., Bao, BK. & Xu, C. Inductive hierarchical nonnegative graph embedding for “verb–object” image classification. Machine Vision and Applications 25, 1647–1659 (2014). https://doi.org/10.1007/s00138-013-0548-3

  • DOI: https://doi.org/10.1007/s00138-013-0548-3
