Abstract
Deep learning is a recent form of machine learning that depends on structured compositional models, which can represent mapping functions that would otherwise require exponentially larger flat models. The most successful realization of this learning paradigm is deep neural networks, which have in recent years achieved state-of-the-art performance in tasks related, in particular, to visual data, audio data, and natural language processing. The main drawback of such methods is that, although their performance can be superhuman, it is achieved only on very specific individual tasks. Intelligence, however, must be broader and more general. The next big research program is therefore to build machines that are smart in a more general sense, over multiple tasks and domains. One way to achieve that is through transfer learning. Transfer learning refers to a broad set of techniques, all aimed at reusing knowledge gained from solving one problem to help solve another. In this paper we study the effectiveness of known deep architectures in transfer learning on visual tasks. We consider two members of the VGG family, namely VGG16 and VGG19, the Xception architecture, DenseNet121, and finally the ResNet50 architecture. All are pretrained on the ImageNet dataset; we tune them and test their transfer performance on the Caltech-256 image set. Four of these architectures have shown good training/validation performance on the latter dataset. In addition, DenseNet121 and Xception have shown markedly superior performance over the VGG variants and ResNet50, though they are an order of magnitude smaller in size. However, they are an order of magnitude deeper, supporting the long-standing conjecture that deep compositional architectures are exponentially more representative and expressive than flatter networks.
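The setup described above (an ImageNet-pretrained backbone tuned on Caltech-256) can be sketched roughly as follows. This is a minimal illustration and not the chapter's actual training code: the use of `tf.keras.applications`, the helper name `build_transfer_model`, the frozen backbone, the single softmax head, the 224x224 input size, the 256-class default, and the optimizer settings are all assumptions made here for illustration.

```python
import tensorflow as tf

# The five backbones named in the abstract, as exposed by tf.keras.applications.
BACKBONES = {
    "VGG16": tf.keras.applications.VGG16,
    "VGG19": tf.keras.applications.VGG19,
    "Xception": tf.keras.applications.Xception,
    "DenseNet121": tf.keras.applications.DenseNet121,
    "ResNet50": tf.keras.applications.ResNet50,
}


def build_transfer_model(name, num_classes=256, weights="imagenet",
                         input_shape=(224, 224, 3)):
    """Hypothetical helper: pretrained backbone plus a new classification head.

    num_classes=256 is an assumption (Caltech-256 object categories);
    adjust it if the clutter class is included.
    """
    # Load the convolutional part only; global average pooling yields one
    # feature vector per image.
    base = BACKBONES[name](include_top=False, weights=weights,
                           input_shape=input_shape, pooling="avg")
    # Freeze the transferred weights so only the new head is trained
    # (an assumed strategy; unfreezing layers for fine-tuning is another option).
    base.trainable = False

    inputs = tf.keras.Input(shape=input_shape)
    # Each backbone expects its own input preprocessing
    # (e.g. tf.keras.applications.densenet.preprocess_input); omitted here.
    features = base(inputs, training=False)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(features)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Calling `build_transfer_model("DenseNet121")` downloads the ImageNet weights on first use; the returned model can then be passed to `model.fit` with a Caltech-256 data pipeline, which is not shown here.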
Acknowledgements
This work is funded by the Science and Technology Development Fund STDF (Egypt); Project ID: 42519 - “Automatic Video Surveillance System for Crowd Scenes”.
Copyright information
© 2021 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Gomaa, W. (2021). Deep Architectures in Visual Transfer Learning. In: Ahad, M.A.R., Inoue, A. (eds) Vision, Sensing and Analytics: Integrative Approaches. Intelligent Systems Reference Library, vol 207. Springer, Cham. https://doi.org/10.1007/978-3-030-75490-7_1
DOI: https://doi.org/10.1007/978-3-030-75490-7_1
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75489-1
Online ISBN: 978-3-030-75490-7
eBook Packages: Intelligent Technologies and Robotics (R0)