Minimum volume simplex-based scene representation and attribute recognition with feature fusion

Zou, Zhiyuan; Liu, Weibin; Xing, Weiwei; Zhang, Shunli

doi:10.1007/s10489-022-03697-9


Published: 04 August 2022

Applied Intelligence, volume 53, pages 8959–8977 (2023)



Abstract

Scene attribute recognition aims to identify the attribute labels of a scene image from its scene representation, enabling deeper semantic understanding of scenes. Over the past decades, numerous scene representation algorithms have been proposed, based on either feature engineering or deep convolutional neural networks. For models based on only one kind of image feature, it remains difficult to learn representations of multiple attributes from local image regions. For models based on deep learning, although multiple labels can be used directly to learn attribute representations, a large amount of training data is usually needed to build the multi-label model. In this paper, we address the problem through scene representation modeling with multiple features and without deep learning. First, we introduce the linear mixing model (LMM) for scene image modeling and then present a novel approach, referred to as mini-batch minimum simplex estimation (MMSE), for learning attribute-based scene representations from highly complex image data. Finally, we propose a two-stage multi-feature fusion method to further improve the feature representation for scene attribute recognition. The proposed method takes advantage of the fast convergence of nonnegative matrix factorization (NMF) schemes while using mini-batches to speed up computation on large-scale scene datasets. Experimental results on real scene images demonstrate that the proposed method outperforms several other advanced scene attribute recognition approaches.
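To make the combination of a linear mixing model with mini-batch NMF concrete, the following is a minimal sketch, not the authors' MMSE algorithm. It assumes the LMM takes the usual form X ≈ AS with nonnegative factors, and it uses plain multiplicative updates on mini-batches of columns. The minimum-volume simplex term, the sum-to-one constraint on the coefficients, and the two-stage feature fusion are all omitted. The function names `minibatch_nmf` and `relative_error`, and every parameter value, are illustrative assumptions.

```python
import numpy as np


def minibatch_nmf(X, k, batch_size=64, epochs=30, seed=0, eps=1e-9):
    """Factor a nonnegative matrix X (d x n, one sample per column) as A @ S.

    A (d x k) plays the role of the mixing basis and S (k x n) the
    nonnegative mixing coefficients. Generic mini-batch NMF only:
    no volume regularizer and no sum-to-one constraint (assumption).
    """
    rng = np.random.default_rng(seed)
    d, n = X.shape
    A = rng.random((d, k)) + eps
    S = rng.random((k, n)) + eps
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb = X[:, idx]
            Sb = S[:, idx]  # fancy indexing returns a copy
            # Multiplicative update of the coefficients for this batch.
            Sb *= (A.T @ Xb) / (A.T @ A @ Sb + eps)
            # Multiplicative update of the basis using this batch only.
            A *= (Xb @ Sb.T) / (A @ (Sb @ Sb.T) + eps)
            S[:, idx] = Sb  # write the updated batch back
    return A, S


def relative_error(X, A, S):
    """Frobenius-norm reconstruction error relative to the norm of X."""
    return np.linalg.norm(X - A @ S) / np.linalg.norm(X)
```

Because each update touches only one batch of columns, the cost per step is independent of the total number of samples, which is the general reason mini-batching helps on large datasets. Whether this matches the update scheme used in the paper cannot be determined from the abstract alone.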



Acknowledgements

This research is partially supported by the Beijing Natural Science Foundation (No. 4212025) and the National Natural Science Foundation of China (No. 61876018, No. 61976017).

Author information

Authors and Affiliations

Institute of Information Science, Beijing Jiaotong University, Beijing, China
Zhiyuan Zou & Weibin Liu
School of Software Engineering, Beijing Jiaotong University, Beijing, China
Weiwei Xing & Shunli Zhang

Corresponding author

Correspondence to Weibin Liu.

Ethics declarations

Conflict of Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Zou, Z., Liu, W., Xing, W. et al. Minimum volume simplex-based scene representation and attribute recognition with feature fusion. Appl Intell 53, 8959–8977 (2023). https://doi.org/10.1007/s10489-022-03697-9


Accepted: 29 April 2022
Published: 04 August 2022
Issue Date: April 2023
DOI: https://doi.org/10.1007/s10489-022-03697-9
