Abstract
This paper introduces a novel and simple approach to high-level scene classification. Because objects are the essence of any given scene, the proposed method uses them to construct well-structured background knowledge, composed of ranking functions and a statistical collection, to support the scene classification process. Since not all objects are relevant, only the most salient ones are identified and used to compute the appropriate scene category. To demonstrate the efficiency of the proposed method, experiments are conducted on state-of-the-art datasets: MIT Indoor, SUN900, SUN2012, SUN397 and LabelMe+. Comparisons with other methods are also presented. The obtained results are reported and discussed.
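The general idea of classifying a scene from its most salient objects can be illustrated with a minimal Python sketch. This is not the ranking functions or the statistical collection defined in the paper, which the abstract does not specify. It assumes a toy training set of (category, object labels) pairs, uses an inverse-category-frequency weight as a stand-in for object saliency, and scores each category by the weighted frequency of the selected objects. The names `build_statistics`, `rank_objects`, `classify` and the parameter `top_k` are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_statistics(annotated_images):
    """Collect, per scene category, how many images contain each object."""
    object_counts = defaultdict(Counter)
    image_counts = Counter()
    for category, objects in annotated_images:
        image_counts[category] += 1
        object_counts[category].update(set(objects))
    return object_counts, image_counts


def rank_objects(objects, object_counts):
    """Rank known objects, rarest across categories first.

    Assumption: an object seen in few categories is treated as more salient.
    This weight is an illustrative proxy, not the paper's ranking function.
    """
    n_categories = len(object_counts)
    scores = {}
    for obj in set(objects):
        seen_in = sum(1 for counts in object_counts.values() if counts[obj] > 0)
        if seen_in > 0:
            scores[obj] = math.log(1 + n_categories / seen_in)
    # Sort by descending weight, then by name for a deterministic order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def classify(objects, object_counts, image_counts, top_k=2):
    """Pick the category best supported by the top_k most salient objects."""
    salient = rank_objects(objects, object_counts)[:top_k]
    if not salient:
        return None  # no known object, so no evidence for any category
    best_category, best_score = None, -1.0
    for category in sorted(object_counts):
        score = sum(
            weight * object_counts[category][obj] / image_counts[category]
            for obj, weight in salient
        )
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Toy annotations, invented purely for illustration.
training = [
    ("kitchen", ["stove", "sink", "wall", "cabinet"]),
    ("kitchen", ["stove", "fridge", "wall"]),
    ("bedroom", ["bed", "lamp", "wall"]),
    ("bedroom", ["bed", "pillow", "wall", "cabinet"]),
]
object_counts, image_counts = build_statistics(training)
print(classify(["wall", "bed", "lamp"], object_counts, image_counts))
```

In this sketch, an object such as "wall" that appears in every category receives the lowest weight and is dropped once `top_k` objects have been selected, which mirrors the observation that not all objects are relevant to the scene category.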
Acknowledgments
We thank Mr. Miles Mathew and Mr. Richard Hacken of Brigham Young University in Provo, Utah, USA, for their great help in reviewing this paper and for their very useful recommendations.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Benrais, L., Baha, N. High level visual scene classification using background knowledge of objects. Multimed Tools Appl 81, 3663–3692 (2022). https://doi.org/10.1007/s11042-021-11701-6
DOI: https://doi.org/10.1007/s11042-021-11701-6