Inductive hierarchical nonnegative graph embedding for “verb–object” image classification

Sun, Chao; Bao, Bing-Kun; Xu, Changsheng

doi:10.1007/s00138-013-0548-3

Inductive hierarchical nonnegative graph embedding for “verb–object” image classification

Special Issue Paper
Published: 11 October 2013

Volume 25, pages 1647–1659, (2014)
Cite this article

Machine Vision and Applications Aims and scope Submit manuscript

Chao Sun^1,2,
Bing-Kun Bao^1,2 &
Changsheng Xu^1,2

262 Accesses
2 Citations
Explore all metrics

Abstract

Most existing image classification algorithms mainly focus on dealing with images with only “object” concepts. However, in real-world cases, a great variety of images contain “verb–object” concepts, rather than only “object” ones. The hierarchical structure embedded in these “verb–object” concepts can help to enhance classification. However, traditional feature representation methods cannot utilize it. To tackle this problem, we present in this paper a novel approach, called inductive hierarchical nonnegative graph embedding. By assuming that those “verb–object” concept images which share the same “object” part but different “verb” part have a specific hierarchical structure, we integrate this hierarchical structure into the nonnegative graph embedding technique, together with the definition of inductive matrix, to (1) conduct effective feature extraction from hierarchical structure, (2) easily transfer each new testing sample into its low-dimensional nonnegative representation, and (3) perform image classification of “verb–object” concept images. Extensive experiments compared with the state-of-the-art algorithms on nonnegative data factorization demonstrate the classification power of proposed approach on “verb–object” concept images classification.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Verb-Object Concepts Image Classification via Hierarchical Nonnegative Graph Embedding

Semantic embedding: scene image classification using scene-specific objects

Article 18 October 2022

The Latent Semantic Power of Labels: Improving Image Classification via Natural Language Semantic

Notes

Superscript numbers of matrices, 1, 2, 11, 12, etc., are symbols, not the power in math.
http://images.google.com.
http://www.flickr.com/.

References

Belhumeur, P., Hespanha, J.: Eigenfaces vs. fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 19(7), 711–720 (1997)
Article Google Scholar
Carneiro, G., Chan, A., Moreno, P., Vasconcelos, N.: Supervised learning of semantic classes for image annotation and retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 29(3), 394–410 (2007)
Article Google Scholar
Ding, C.H., Li, T., Jordan, M.I.: Convex and semi-nonnegative matrix factorizations. IEEE Trans. Pattern Anal. Mach. Intell. 32(1), 45–55 (2010)
Article Google Scholar
Gao, Y., Fan, J., Xue, X., Jain, R.: Automatic image annotation by incorporating feature hierarchy and boosting to scale up svm classifiers. Proceedings of the 14th annual ACM international conference on Multimedia, pp. 901–910. ACM, New York (2006)
Heger, A., Holm, L.: Sensitive pattern discovery with fuzzyalignments of distantly related proteins. Bioinformatics 19(suppl 1), i130–i137 (2003)
Article Google Scholar
Hong, R., Tang, J., Tan, H.-K., Ngo, C.-W., Yan, S., Chua, T.-S.: Beyond search: event-driven summarization for web videos. TOMCCAP 7(4), 35 (2011)
Google Scholar
Hong, R., Wang, M., Li, G., Nie, L., Zha, Z.-J., Chua, T.-S.: Multimedia question answering. IEEE Multimed. 19(4), 72–78 (2012)
Google Scholar
Hoyer, P.O.: Non-negative matrix factorization with sparseness constraints. J. Mach. Learn. Res. 5, 1457–1469 (2004)
MathSciNet MATH Google Scholar
Hu, C., Zhang, B., Yan, S., Yang, Q., Yan, J., Chen, Z., Ma, W.: Mining ratio rules via principal sparse non-negative matrix factorization. In Fourth IEEE International Conference on Data Mining, 2004. ICDM’04, pp. 407–410. IEEE (2004)
Kim, P., Tidor, B.: Subsystem identification through dimensionality reduction of large-scale gene expression data. Genome Res. 13(7), 1706–1718 (2003)
Article Google Scholar
Kuhn, H.W., Tucker, A.W.: Nonlinear programming. In: Second Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 481–492 (1951)
Lee, D., Seung, H., et al.: Learning the parts of objects by non-negative matrix factorization. Nature 401(6755), 788–791 (1999)
Article Google Scholar
Li, L., Jiang, S., Huang, Q.: Learning hierarchical semantic description via mixed-norm regularization for image understanding. IEEE Trans. Multimed. 14(5), 1401–1413 (2012)
Article Google Scholar
Li, L.-J., Su, H., Fei-Fei, L., Xing, E.P.: Object bank: a high-level image representation for scene classification & semantic feature sparsification, pp. 1378–1386. In: Advances in Neural Information Processing Systems (2010)
Li, S.Z., Hou, X.W., Zhang, H.J., Cheng, Q.S.: Learning spatially localized, parts-based representation. In: Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001. CVPR 2001, vol. 1, pp. I-207. IEEE (2001)
Liu, X., Yan, S., Jin, H.: Projective nonnegative graph embedding. IEEE Trans. Image Process. 19(5), 1126–1137 (2010)
Article MathSciNet Google Scholar
Ramanath, R., Kuehni, R., Snyder, W., Hinks, D.: Spectral spaces and color spaces. Color Res. Appl. 29(1), 29–37 (2004)
Article Google Scholar
Ramanath, R., Snyder, W., Qi, H.: Eigenviews for object recognition in multispectral imaging systems. In: Applied Imagery Pattern Recognition Workshop, 2003. Proceedings. 32nd, pp. 33–38. IEEE (2003)
Sun, C., Bao, B.-K., Xu, C.: Verb-object concepts image classification via hierarchical nonnegative graph embedding. In: Proceeding of 19th International Conference on Multimedia Modeling (MMM), pp. 58–69 (2013)
Wang, C., Song, Z., Yan, S., Zhang, L., Zhang, H.: Multiplicative nonnegative graph embedding. In: IEEE Conference on Computer Vision and Pattern Recognition, 2009. CVPR 2009, pp. 389–396. IEEE (2009)
Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., Gong, Y.: Locality-constrained linear coding for image classification. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, pp. 3360–3367. IEEE (2010)
Wang, M., Hong, R., Li, G., Zha, Z.-J., Yan, S., Chua, T.-S.: Event driven web video summarization by tag localization and key-shot identification. IEEE Trans. Multimed. 14(4), 975–985 (2012)
Google Scholar
Wang, Y., Jia, Y.: Fisher non-negative matrix factorization for learning local features. In: Proc. Asian Conf. on Comp. Vision, Citeseer (2004)
Yan, S., Xu, D., Zhang, B., Zhang, H., Yang, Q., Lin, S.: Graph embedding and extensions: a general framework for dimensionality reduction. IEEE Trans. Pattern Anal. Mach. Intell. 29(1), 40–51 (2007)
Google Scholar
Yang, J., Yang, S., Fu, Y., Li, X., Huang, T.: Non-negative graph embedding. In: IEEE Conference on Computer Vision and Pattern Recognition, 2008. CVPR 2008, pp. 1–8. IEEE (2008)
Yao, B., Jiang, X., Khosla, A., Lin, A., Guibas, L., Fei-Fei, L.: Human action recognition by learning bases of action attributes and parts. In: IEEE International Conference on Computer Vision (ICCV), 2011, pp. 1331–1338. IEEE (2011)
Yun, X.: Non-negative matrix factorization for face recognition. PhD thesis, Hong Kong Baptist University (2007)
Zhang, X., Zha, Z., Xu, C.: Learning verb-object concepts for semantic image annotation. Proceedings of the 19th ACM International Conference on Multimedia, pp. 1077–1080. ACM, New York (2011)

Download references

Acknowledgments

This work is supported in part by National Basic Research Program of China (No. 2012CB316304), National Natural Science Foundation of China (No. 61225009, No. 61201374) and Beijing Natural Science Foundation (No. 4131004). This work is also supported by the Singapore National Research Foundation under its International Research Centre@Singapore Funding Initiative and administered by the IDM Programme Office.

Author information

Authors and Affiliations

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Chao Sun, Bing-Kun Bao & Changsheng Xu
China-Singapore Institute of Digital Media, Singapore, Singapore
Chao Sun, Bing-Kun Bao & Changsheng Xu

Authors

Chao Sun
View author publications
You can also search for this author in PubMed Google Scholar
Bing-Kun Bao
View author publications
You can also search for this author in PubMed Google Scholar
Changsheng Xu
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Changsheng Xu.

Appendix

Here we present the convergence proof of update rule for both matrix $W$ and matrix $C$.

1.1 Preliminaries

First of all, we introduce the concept of auxiliary function and the lemma which will be used for algorithmic derivation.

Definition 1

Function $G(A,A')$ is an auxiliary function for function $F(A)$ if the following conditions are satisfied

$$\begin{aligned} G(A,A') \ge F(A), \qquad G(A,A) = F(A) \end{aligned}$$

(44)

From this definition, we have the following lemma with proof omitted [12].

Lemma 1

If $G$ is an auxiliary function, then $F$ is non-increasing under the update

$$\begin{aligned} A^{t+1} = \mathrm{arg\, min}_A G(A,A') \end{aligned}$$

(45)

where $t$ denotes the $t$th iteration.

1.2 Convergence proof of update rule for W

Let $F_{ij}$ as the part of $F(W)$ relevant to $W_{ij}$, we have

$$\begin{aligned} F'_{ij}(W)&= [W(2Q^1 +2Q^2 +2Q^3) +2\lambda W C X X^T C^T \nonumber \\&- 2\lambda X X^T C^T]_{ij}\end{aligned}$$

(46)

$$\begin{aligned} F''_{ij}(W)&= [2(Q^1 +Q^2 +Q^3) +2\lambda C X X^T C^T]_{jj} \end{aligned}$$

(47)

The auxiliary function of $F_{ij}$ is then designed as

$$\begin{aligned} G(W_{ij}, W^t_{ij})&= F_{ij}(W^t_{ij}) + F'_{ij}(W_{ij})(W_{ij} - W^t_{ij}) \nonumber \\&+ \frac{[W^t (Q^1_+ \!+\!Q^2_+ \!+\!Q^3_+) \!+\! \lambda \!W^t \!C X X^T C^T]_{ij}}{\!W^t_{ij}} \nonumber \\&\times (W_{ij} - W^t_{ij})^2 \end{aligned}$$

(48)

Lemma 2

Equation (48) is an auxiliary function for $F_{ij}$, which is the part of $F(W)$ relevant to $W_{ij}$.

Proof

Obviously, $G(W_{ij}, W_{ij}) = F_{ij}(W_{ij})$. We only need to prove that $G(W_{ij}, W^t_{ij}) \ge F_{ij}(W_{ij})$.

First, we have the Taylor series expansion of $F_{ij}$

$$\begin{aligned} F_{ij}(W_{ij})&=F_{ij}(W^t_{ij}) + F'_{ij}(W^t_{ij})(W_{ij} - W^t_{ij}) \nonumber \\&\quad + \frac{1}{2} F''_{ij}(W^t_{ij})(W_{ij} - W^t_{ij})^2 \end{aligned}$$

(49)

Then, it is easy to verify that

$$\begin{aligned}&[\lambda W^t C X X^T C^T]_{ij} \ge W^t_{ij} [\lambda C X X^T C^T]_{jj}\end{aligned}$$

(50)

$$\begin{aligned}&[W^t (Q^1_+ +Q^2_+ +Q^3_+)]_{ij} \ge W^t_{ij} [(Q^1_+ +Q^2_+ +Q^3_+)]_{jj}\nonumber \\ \end{aligned}$$

(51)

Thus we have

$$\begin{aligned}&\frac{[W^t (Q^1_+ +Q^2_+ +Q^3_+) + \lambda W^t C X X^T C^T]_{ij}}{W^t_{ij}} \nonumber \\&\quad \ge [(Q^1 +Q^2 +Q^3) +\lambda C X X^T C^T]_{jj} \end{aligned}$$

(52)

Then, $G(W_{ij}, W^t_{ij}) \ge F_{ij}(W_{ij})$ holds.

Lemma 3

Equation (34) could be obtained by minimizing the auxiliary function $G(W_{ij}, W^t_{ij})$.

Proof

Let $\partial G(W_{ij}, W^t_{ij}) / \partial W_{ij} = 0$, we have

$$\begin{aligned}&F'_{ij}(W^t_{ij}) \!+\! 2 \frac{[W^t (Q^1_+ \!+\!Q^2_+ +Q^3_+) \!+\! \lambda W^t C X X^T C^T]_{ij}}{W^t_{ij}}\times \nonumber \\&\quad (W_{ij} - W^t_{ij}) = 0 \end{aligned}$$

(53)

Finally we can obtain the update rule for $W$

$$\begin{aligned} W^{t+1}_{ij} \leftarrow W^t_{ij} \frac{[\lambda X X^T C^T + W^t(Q^1_- + Q^2_- + Q^3_-)]_{ij}}{[\lambda W^t C X X^T C^T + W^t(Q^1_+ +Q^2_+ + Q^3_+)]_{ij}}\nonumber \\ \end{aligned}$$

(54)

and the lemma is proved.

1.3 Convergence proof of update rule for C

Let $F_{ij}$ as the part of $F(C)$ relevant to $C_{ij}$, we have

$$\begin{aligned} F'_{ij}(C)&= [2{R^1}^T R^1 C X L^u X^T + 2{R^2}^T R^2 C X \tilde{L} X^T \nonumber \\&\quad + 2{R^3}^T R^3 C X \tilde{L}^u X^T - 2 \lambda W^T X X^T \nonumber \\&\quad + 2 \lambda W^T W C X X^T]_{ij}\end{aligned}$$

(55)

$$\begin{aligned} F''_{ij}(C)&= 2[{R^1}^T R^1]_{ii} [X L^u X^T]_{jj} + 2[{R^2}^T R^2]_{ii} [X \tilde{L} X^T]_{jj} \nonumber \\&\quad + 2[{R^3}^T R^3]_{ii} [X \tilde{L}^u X^T]_{jj} + 2 \lambda [W^T W]_{ii} [X X^T]_{jj} \nonumber \\ \end{aligned}$$

(56)

The auxiliary function of $F_{ij}$ is then designed as

$$\begin{aligned}&G(C_{ij}, C^t_{ij}) \nonumber \\&\quad = F_{ij}(C^t_{ij}) + F'_{ij}(C_{ij})(C_{ij} - C^t_{ij}) \nonumber \\&\quad \quad + [{R^1}^T R^1 C^t X D^u X^T + {R^2}^T R^2 C^t X \tilde{D} X^T \nonumber \\&\quad \quad + {R^3}^T R^3 C^t X \tilde{D}^u X^T + \lambda W^T W C^t X X^T]_{ij} / C^t_{ij} \nonumber \\&\quad \quad \times (C_{ij} - C^t_{ij})^2 \end{aligned}$$

(57)

Lemma 4

Equation (57) is an auxiliary function for $F_{ij}$, which is the part of $F(C)$ relevant to $C_{ij}$.

Proof

Obviously, $G(C_{ij}, C_{ij}) = F_{ij}(C_{ij})$. We only need to prove that $G(C_{ij}, C^t_{ij}) \ge F_{ij}(C_{ij})$.

First, we have the Taylor series expansion of $F_{ij}$

$$\begin{aligned} F_{ij}(C_{ij})&= F_{ij}(C^t_{ij}) + F'_{ij}(C^t_{ij})(C_{ij} - C^t_{ij}) \nonumber \\&\quad + \frac{1}{2} F''_{ij}(C^t_{ij})(C_{ij} - C^t_{ij})^2 \end{aligned}$$

(58)

Then, it is easy to verify that

$$\begin{aligned}&[W^T W C X X^T]_{ij} \ge C^t_{ij} [W^T W]_{ii} [X X^T]_{jj}\end{aligned}$$

(59)

$$\begin{aligned}&[{R^1}^T R^1 C^t X D^u X^T]_{ij} \ge [{R^1}^T R^1]_{ii} C^t_{ij} [X L^u X^T]_{jj}\end{aligned}$$

(60)

$$\begin{aligned}&[{R^2}^T R^2 C^t X \tilde{D} X^T]_{ij} \ge [{R^2}^T R^2]_{ii} C^t_{ij} [X \tilde{L} X^T]_{jj}\end{aligned}$$

(61)

$$\begin{aligned}&[{R^3}^T R^3 C X \tilde{D}^u X^T]_{ij} \ge [{R^3}^T R^3]_{ii} C^t_{ij} [X \tilde{L}^u X^T]_{jj} \end{aligned}$$

(62)

Thus we have $G(C_{ij}, C^t_{ij}) \ge F_{ij}(C_{ij})$.

Lemma 5

Equation (43) could be obtained by minimizing the auxiliary function $G(C_{ij}, C^t_{ij})$.

Proof

Let $\partial G(C_{ij}, C^t_{ij}) \ / \ \partial C_{ij} = 0$, we have

$$\begin{aligned}&F'_{ij}(C_{ij}) + [{R^1}^T R^1 C^t X D^u X^T + {R^2}^T R^2 C^t X \tilde{D} X^T \nonumber \\&\quad + {R^3}^T R^3 C^t X \tilde{D}^u X^T + \lambda W^T W C^t X X^T]_{ij} / C^t_{ij} \nonumber \\&\quad \cdot (C_{ij} - C^t_{ij}) = 0 \end{aligned}$$

(63)

Finally we can obtain the update rule for $C$

$$\begin{aligned}&C^{t+1}_{ij} \leftarrow C^t_{ij} \cdot [\lambda W^T X X^T + {R^1}^T R^1 C^t X S^u X^T \nonumber \\&\quad + {R^2}^T R^2 C^t X \tilde{S} X^T + {R^3}^T R^3 C^t X \tilde{S}^u X^T]_{ij} \nonumber \\&\quad / \ [\lambda W^T W C^t X X^T + {R^1}^T R^1 C^t X D^u X^T \nonumber \\&\quad + {R^2}^T R^2 C^t X \tilde{D} X^T + {R^3}^T R^3 C^t X \tilde{D}^u X^T]_{ij} \end{aligned}$$

(64)

and the lemma is proved. $\square $

Rights and permissions

Reprints and permissions

About this article

Cite this article

Sun, C., Bao, BK. & Xu, C. Inductive hierarchical nonnegative graph embedding for “verb–object” image classification. Machine Vision and Applications 25, 1647–1659 (2014). https://doi.org/10.1007/s00138-013-0548-3

Download citation

Received: 20 February 2013
Revised: 07 August 2013
Accepted: 19 September 2013
Published: 11 October 2013
Issue Date: October 2014
DOI: https://doi.org/10.1007/s00138-013-0548-3

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Inductive hierarchical nonnegative graph embedding for “verb–object” image classification

Abstract

Access this article

Similar content being viewed by others

Verb-Object Concepts Image Classification via Hierarchical Nonnegative Graph Embedding

Semantic embedding: scene image classification using scene-specific objects

The Latent Semantic Power of Labels: Improving Image Classification via Natural Language Semantic

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Appendix

1.1 Preliminaries

Definition 1

Lemma 1

1.2 Convergence proof of update rule for W

Lemma 2

Proof

Lemma 3

Proof

1.3 Convergence proof of update rule for C

Lemma 4

Proof

Lemma 5

Proof

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Inductive hierarchical nonnegative graph embedding for “verb–object” image classification

Abstract

Access this article

Similar content being viewed by others

Verb-Object Concepts Image Classification via Hierarchical Nonnegative Graph Embedding

Semantic embedding: scene image classification using scene-specific objects

The Latent Semantic Power of Labels: Improving Image Classification via Natural Language Semantic

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Appendix

Appendix

1.1 Preliminaries

Definition 1

Lemma 1

1.2 Convergence proof of update rule for W

Lemma 2

Proof

Lemma 3

Proof

1.3 Convergence proof of update rule for C

Lemma 4

Proof

Lemma 5

Proof

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation