AutoFCL: automatically tuning fully connected layers for handling small dataset

Basha, S. H. Shabbeer; Vinakota, Sravan Kumar; Dubey, Shiv Ram; Pulabaigari, Viswanath; Mukherjee, Snehasis

doi:10.1007/s00521-020-05549-4

Original Article
Published: 04 January 2021

Volume 33, pages 8055–8065, (2021)
Neural Computing and Applications

S. H. Shabbeer Basha ORCID: orcid.org/0000-0002-8590-0897¹,
Sravan Kumar Vinakota¹,
Shiv Ram Dubey¹,
Viswanath Pulabaigari¹ &
Snehasis Mukherjee¹

Abstract

Deep convolutional neural networks (CNNs) have become popular machine learning models for image classification over the past few years because they learn problem-specific features directly from the input images. The success of deep learning models shifts the effort from hand-engineering features to engineering architectures. However, designing a state-of-the-art CNN for a given task remains non-trivial and challenging, especially when training data is scarce. To address this problem, transfer learning is widely adopted. When transferring learned knowledge from one task to another, fine-tuning with target-dependent fully connected (FC) layers generally produces better results on the target task. In this paper, the proposed AutoFCL model attempts to learn the structure of the FC layers of a CNN automatically using Bayesian optimization. To evaluate AutoFCL, we use five pre-trained CNN models: VGG-16, ResNet, DenseNet, MobileNet, and NASNetMobile. The experiments are conducted on three benchmark datasets: CalTech-101, Oxford-102 Flowers, and UC Merced Land Use. In these experiments, fine-tuning the newly learned (target-dependent) FC layers leads to state-of-the-art performance. AutoFCL outperforms existing methods on the CalTech-101 and Oxford-102 Flowers datasets, achieving accuracies of \(94.38\%\) and \(98.89\%\), respectively, and achieves comparable performance on the UC Merced Land Use dataset with \(96.83\%\) accuracy. The source code of this research is available at https://github.com/shabbeersh/AutoFCL.
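To make the idea of searching over FC-layer structures with Bayesian optimization concrete, the following is a minimal sketch and not the authors' implementation. The search space (number of FC layers, units per layer, dropout rate), the `toy_score` objective standing in for "fine-tune the head and measure validation accuracy", and all function names are assumptions introduced for illustration. The sketch fits a small Gaussian-process surrogate with an RBF kernel and picks the next configuration by expected improvement, using only the Python standard library.

```python
import itertools
import math

# Hypothetical search space for the FC head (illustrative values, not from the paper).
NUM_LAYERS = [1, 2, 3]
NUM_UNITS = [64, 128, 256, 512, 1024]
DROPOUT = [0.0, 0.25, 0.5]
CANDIDATES = list(itertools.product(NUM_LAYERS, NUM_UNITS, DROPOUT))


def encode(cfg):
    """Map a configuration to a point in the unit cube for the GP kernel."""
    layers, units, drop = cfg
    return ((layers - 1) / 2.0, (math.log2(units) - 6.0) / 4.0, drop / 0.5)


def toy_score(cfg):
    """Synthetic stand-in for 'fine-tune this head, return validation accuracy'."""
    return 1.0 - sum((xi - 0.5) ** 2 for xi in encode(cfg))


def rbf(a, b, length=0.5):
    """Squared-exponential kernel between two encoded configurations."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2.0 * length ** 2))


def cholesky(A):
    """Lower-triangular Cholesky factor, with a floor on the diagonal for safety."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    return L


def forward_sub(L, b):
    """Solve L z = b for lower-triangular L."""
    z = []
    for i in range(len(b)):
        z.append((b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return z


def backward_sub(L, z):
    """Solve L^T a = z for lower-triangular L."""
    n = len(z)
    a = [0.0] * n
    for i in reversed(range(n)):
        a[i] = (z[i] - sum(L[k][i] * a[k] for k in range(i + 1, n))) / L[i][i]
    return a


def expected_improvement(mu, sigma, best):
    """Expected improvement of a Gaussian prediction over the best score so far."""
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf


def search(objective, n_iter=12, noise=1e-4):
    """Evaluate three seed configurations, then n_iter configurations chosen by EI."""
    history = [(c, objective(c)) for c in (CANDIDATES[0], CANDIDATES[15], CANDIDATES[30])]
    for _ in range(n_iter):
        X = [encode(c) for c, _ in history]
        mean_y = sum(s for _, s in history) / len(history)
        y = [s - mean_y for _, s in history]
        K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
             for i, a in enumerate(X)]
        L = cholesky(K)
        alpha = backward_sub(L, forward_sub(L, y))
        best = max(s for _, s in history)
        seen = {c for c, _ in history}

        def acquisition(cfg):
            k = [rbf(a, encode(cfg)) for a in X]
            mu = mean_y + sum(ki * ai for ki, ai in zip(k, alpha))
            v = forward_sub(L, k)
            var = max(1.0 - sum(vi * vi for vi in v), 1e-12)
            return expected_improvement(mu, math.sqrt(var), best)

        nxt = max((c for c in CANDIDATES if c not in seen), key=acquisition)
        history.append((nxt, objective(nxt)))
    return history


history = search(toy_score)
best_cfg, best_score = max(history, key=lambda item: item[1])
print(best_cfg, round(best_score, 3))
```

In a real setting, `toy_score` would be replaced by a routine that attaches the candidate FC head to a frozen pre-trained backbone, fine-tunes it on the target dataset, and returns validation accuracy; that step is the expensive black-box evaluation that a surrogate model is meant to economize.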

Acknowledgements

We appreciate NVIDIA Corporation’s support through the donation of a GeForce Titan XP GPU (Grant No. GPU-900-1G611-2500-000T), which was used for this research.

Author information

Authors and Affiliations

Computer Vision and Machine Learning Groups, Indian Institute of Information Technology Sri City, Chittoor, Andhra Pradesh, 517646, India
S. H. Shabbeer Basha, Sravan Kumar Vinakota, Shiv Ram Dubey, Viswanath Pulabaigari & Snehasis Mukherjee

Corresponding author

Correspondence to S. H. Shabbeer Basha.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Basha, S.H.S., Vinakota, S.K., Dubey, S.R. et al. AutoFCL: automatically tuning fully connected layers for handling small dataset. Neural Comput & Applic 33, 8055–8065 (2021). https://doi.org/10.1007/s00521-020-05549-4

Received: 02 April 2020
Accepted: 18 November 2020
Published: 04 January 2021
Issue Date: July 2021
DOI: https://doi.org/10.1007/s00521-020-05549-4
