Improved transfer learning of CNN through fine-tuning and classifier ensemble for scene classification

  • Data analytics and machine learning
Soft Computing

Abstract

In high-resolution remote sensing imagery, scene classification is challenging because image structures are similar across classes and the available datasets are all small. Training a new convolutional neural network (CNN) on such small datasets is prone to overfitting, and the attainable accuracy is poor. To overcome this, we adopt a form of transfer learning, namely a fine-tuning strategy, using the pre-trained CNNs AlexNet, VGG 16, and VGG 19. First, we design a network by replacing the classifier-stage layers with revised ones through transfer learning. Second, we apply fine-tuning from right to left (from the output layers back toward the input), retraining the classifier stage and part of the feature extraction stage (the last convolutional block). Third, we form a classifier ensemble using a majority voting strategy to obtain better classification results. The UCM and SIRI-WHU datasets were used, and the results were compared with state-of-the-art methods. Finally, to check the usefulness of the proposed methods, we form sub-datasets from the AID and WHU-RS19 datasets using classes with similarly labeled names. To assess the performance of the proposed classifiers, we compute the overall accuracy from the confusion matrix, together with the F1-score. The proposed methods improve the accuracy from 93.57 to 99.04% for UCM and from 91.34 to 99.16% for SIRI-WHU.
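The ensemble and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the tie-breaking rule (the earliest-listed classifier wins), the macro averaging of the F1-score, and the toy label lists are all assumptions made for illustration only.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    predictions: list of equal-length label lists, one per classifier.
    Ties go to the earliest-listed classifier's label (an assumed rule).
    """
    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # First label, in classifier order, that reaches the top count.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def overall_accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def macro_f1(cm):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp
        fn = sum(cm[k]) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


# Hypothetical toy labels for three classes; not data from the paper.
y_true = [0, 0, 1, 1, 2, 2]
alexnet = [0, 1, 1, 1, 2, 0]
vgg16 = [0, 0, 1, 2, 2, 2]
vgg19 = [0, 0, 0, 1, 2, 2]

ensemble = majority_vote([alexnet, vgg16, vgg19])
cm = confusion_matrix(y_true, ensemble, n_classes=3)
print(ensemble, overall_accuracy(cm), macro_f1(cm))
```

In this toy case each classifier errs on a different sample, so the vote corrects every individual mistake. With real predictions the gain depends on how correlated the classifiers' errors are.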

Data availability

The datasets used in this article are publicly available, and the corresponding hyperlinks are given in the references.

References

  • Aerial Image Dataset (AID) (2021). https://captain-whu.github.io/AID/

  • Bahmanyar R, Cui S, Datcu M (2015) A comparative study of bag-of-words and bag-of-topics models of EO image patches. IEEE Geosci Remote Sens Lett 12(6):1357–1361

  • Basha SS, Vinakota SK, Pulabaigari V, Mukherjee S, Dubey SR (2021) Autotune: automatically tuning convolutional neural networks for improved transfer learning. Neural Netw 133:112–122

  • Boland PJ (2020) Majority systems and the Condorcet jury theorem. J R Stat Soc 38(3):181–189

  • Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: Proceedings of the conference on computer vision and pattern recognition, San Diego, CA, USA, pp 886–893

  • Donahue J, Jia Y, Vinyals O, Hoffman J, Zhang N, Tzeng E, Darrell T (2014) DeCAF: a deep convolutional activation feature for generic visual recognition. In: Proceedings of the 31st international conference on machine learning, Beijing, China, pp 647–655

  • Gao Z, Xie J, Wang Q, Li P (2019) Global second-order pooling convolutional networks. In: Proceedings of the IEEE conference on computer vision pattern recognition, Long Beach, CA, USA, pp 3024–3033

  • Hinton GE, Osindero S, Teh YW (2006) A fast learning algorithm for deep belief nets. Neural Comput 18(7):1527–1554

  • Hu F, Xia G, Hu J, Zhang L (2015) Transferring deep convolutional neural networks for the scene classification of high-resolution remote sensing imagery. Remote Sens 7(11):14680–14707

  • Karlos S, Kostopoulos G, Kotsiantis S (2020) A soft-voting ensemble based co-training scheme using static selection for binary classification problems. Algorithms 13(1):26

  • Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60(6):84–90

  • Kuncheva LI (2014) Combining pattern classifiers: methods and algorithms. Wiley, New York

  • Lazebnik S, Schmid C, Ponce J (2009) Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), vol 2, pp 2169–2178

  • Li X, Wang Q, Liu S, Chanussot J (2018) Scene classification with recurrent attention of VHR remote sensing images. IEEE Trans Geosci Remote Sens 57(2):1155–1167

  • Li L, Liang P, Ma J, Jiao L, Guo X, Liu F, Sun C (2020) A multiscale self-adaptive attention network for remote sensing scene classification. Remote Sens 12(14):1527–1554

  • Lienou M, Maitre H, Datcu M (2010) Semantic annotation of satellite images using latent dirichlet allocation. IEEE Geosci Remote Sens Lett 7(1):28–32

  • Liu Y, Zhong Y, Fei F, Zhu Q, Qin Q (2018) Scene classification based on a deep random-scale stretched convolutional neural network. Remote Sens 10(3):444

  • Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110

  • Lv Y, Zhang X, Xiong W, Cui Y, Cai M (2019) An end-to-end local-global-fusion feature extraction network for remote sensing image scene classification. Remote Sens 11(24):3006

  • Ojala T, Pietikainen M, Maenpaa T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24:971–987

  • Sheng G, Yang W, Xu T, Sun H (2012) High-resolution satellite scene classification using a sparse coding based multiple feature combination. Int J Remote Sens 33(8):2395–2412

  • Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: Proceedings of the international conference learning representation (ICLR), San Diego, CA, USA, pp 1–14

  • SIRI-WHU Data Set (2020). http://www.lmars.whu.edu.cn/prof_web/zhongyanfei/Num/Google.html

  • Szegedy C, Liu W, Jia Y, Vanhoucke V (2015) Going deeper with convolutions. In: Proceedings IEEE conference computer vision pattern recognition (CVPR), pp 1–9

  • UC Merced Data Set (2019). http://weegee.vision.ucmerced.edu/datasets/landuse.html

  • Vincent P, Larochelle H, Lajoie I, Bengio Y, Manzagol P (2010) Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res 11(12):3371–3408

  • Wang Q, Wu B, Zhu P, Li P, Zuo W, Hu Q (2019) ECA-Net: efficient channel attention for deep convolutional neural networks. arXiv:1910.03151

  • WHU-RS 19 Dataset (2021) http://dsp.whu.edu.cn/cn/staff/yw/HRSscene.html, http://captain.whu.edu.cn/datasets/WHU-RS19.zip

  • Wu J, Cai Z (2011) Attribute weighting via differential evolution algorithm for attribute weighted naive bayes (wnb). J Comput Inf Syst 7(4):1672–1679

  • Wu F, Wang C, Zhang B, Zhang H, Gong L (2019) Discrimination of collapsed buildings from remote sensing imagery using deep neural networks. In: IGARSS IEEE geoscience remote sensing symposium, pp 2646–2649

  • Wu Q, Wu B, Hu C, Yan X (2021) Evolutionary multilabel classification algorithm based on cultural algorithm. Symmetry 13(2):322

  • Xia G-S, Hu J, Hu F, Shi B, Bai X, Zhong Y, Zhang L, Lu X (2017) AID: a benchmark data set for performance evaluation of aerial scene classification. IEEE Trans Geosci Remote Sens 55(7):3965–3981

  • Yang Y, Newsam S (2010) Bag-of-visual-words and spatial extensions for land-use classification. In: Proceedings of the international conference on ACM SIGSPATIAL GIS, San Jose, CA, USA, pp 270–279

  • Zeiler MD, Fergus R (2019) Visualizing and understanding convolutional networks. In: Proceedings of the European conference on computer vision. Springer, Cham, pp 818–833

  • Zhang F, Du B, Zhang L (2015) Scene classification via a gradient boosting random convolutional network framework. IEEE Trans Geosci Remote Sens 54(3):1793–1802

  • Zhao B, Zhong Y, Xia G-S, Zhang L (2016) Dirichlet-derived multiple topic scene classification model for high spatial resolution remote sensing imagery. IEEE Trans Geosci Remote Sens 54(4):2108–2123

  • Zhou W, Newsam S, Li C, Shao Z (2018) PatternNet: a benchmark dataset for performance evaluation of remote sensing image retrieval. ISPRS J Photogram Remote Sens 145:197–209

Funding

The authors received no specific funding for this study.

Author information

Corresponding author

Correspondence to S. Thirumaladevi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Thirumaladevi, S., Veera Swamy, K. & Sailaja, M. Improved transfer learning of CNN through fine-tuning and classifier ensemble for scene classification. Soft Comput 26, 5617–5636 (2022). https://doi.org/10.1007/s00500-022-07145-1

  • DOI: https://doi.org/10.1007/s00500-022-07145-1
