Abstract
Deep learning has led to significant breakthroughs in computer vision and object recognition, especially in improving recognition accuracy. Scene recognition algorithms have evolved over the years owing to developments in machine learning and deep convolutional neural networks (DCNNs). In this paper, the classification of indoor scenes is attempted using three deep learning models, namely, ResNet, MobileNet, and EfficientNet. The influence of activation functions on classification accuracy is also explored: three activation functions, namely, tanh, ReLU, and sigmoid, are deployed in the work. The MIT-67 indoor dataset is split into scenes with and without people to test the effect of this split on classification accuracy. The novelty of the work lies in segregating the dataset, based on spatial layout, into two groups, namely, scenes with people and scenes without people. Among the three pre-trained models, EfficientNet gives the best results.
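The three activation functions compared in the paper can be sketched in plain Python as below. This is an illustrative sketch only: in the study these functions are applied inside the layers of the pre-trained DCNNs, not as standalone scalar functions.

```python
import math

# Sigmoid squashes inputs to (0, 1); it saturates for large |x|,
# which can slow gradient flow in deep networks.
def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

# Tanh squashes inputs to (-1, 1); zero-centered, but also saturating.
def tanh(x: float) -> float:
    return math.tanh(x)

# ReLU passes positive inputs unchanged and zeroes out negatives;
# it does not saturate for x > 0, which helps gradients in deep nets.
def relu(x: float) -> float:
    return max(0.0, x)

print(sigmoid(0.0))  # 0.5
print(tanh(0.0))     # 0.0
print(relu(-3.0))    # 0.0
print(relu(2.5))     # 2.5
```

The saturation behaviour noted in the comments is one common reason ReLU tends to outperform sigmoid and tanh when training deep convolutional models.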
Funding
The authors declare that no external funding was received; however, the necessary hardware and software support was provided by the research center of the institute. The authors thank Visvesvaraya Technological University (VTU), Belagavi-590018, and KLE Institute of Technology, Hubballi-580027, India, for providing a platform to carry out the research work.
COMPLIANCE WITH ETHICAL STANDARDS
This article is a completely original work of its authors; it has not been published before and will not be sent to other publications until the PRIA Editorial Board decides not to accept it for publication.
Conflict of Interest
The writing process and the content of this article give no grounds for raising the issue of a conflict of interest.
Additional information
Basavaraj S. Anami is the Principal of K. L. E. Institute of Technology, Hubballi-580027, India. He is a veteran professor in computer science with more than 40 years of teaching experience, including 20 years of research experience. His research areas include agriculture/horticulture image processing and natural language processing. He is a senior member of IEEE and CSI. He has authored three books in computer science, published by PHI, Wiley (India) and UP.
Chetan V. Sagarnal received a Bachelor of Engineering and a Master of Technology in Electronics and Communication Engineering from Visvesvaraya Technological University (VTU), Belagavi-590018, India. He has 7 years of teaching experience and is pursuing his PhD in Computer Science and Engineering at VTU. His current research interests include computer vision and deep learning.
Cite this article
Anami, B.S., Sagarnal, C.V. Influence of Different Activation Functions on Deep Learning Models in Indoor Scene Images Classification. Pattern Recognit. Image Anal. 32, 78–88 (2022). https://doi.org/10.1134/S1054661821040039