Deep Architectures in Visual Transfer Learning

  • Chapter
Vision, Sensing and Analytics: Integrative Approaches

Part of the book series: Intelligent Systems Reference Library (ISRL, volume 207)

Abstract

Deep learning is a recent form of machine learning based on structured compositional models that can represent mapping functions which would otherwise require exponentially larger flat models. The most successful realization of this learning paradigm is the deep neural network, which has in recent years achieved state-of-the-art performance in tasks involving, in particular, visual data, audio data, and natural language processing. The main drawback of such methods is that, despite their superhuman performance, that performance is achieved only on very specific individual tasks. Intelligence must, however, be broader and more general, so the next big research program is to build machines that are smart in a more general sense across multiple tasks and domains. One way to achieve this is through transfer learning, a broad set of techniques that all aim to reuse knowledge gained from solving one problem in the solution of another. In this paper we study the effectiveness of known deep architectures in transfer learning for visual tasks. We consider two members of the VGG family, namely VGG16 and VGG19, along with the Xception, DenseNet121, and ResNet50 architectures. All are pretrained on the ImageNet dataset; we tune them and test their transfer performance on the Caltech-256 image set. Four of these architectures show good training/validation performance on the latter dataset. In addition, DenseNet121 and Xception perform markedly better than the VGG variants and ResNet50, although they are an order of magnitude smaller in size. They are, however, an order of magnitude deeper, confirming the long-standing conjecture that deep compositional architectures are exponentially more representative and expressive than flatter networks.
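The setup described in the abstract, an ImageNet-pretrained backbone reused for a new image set, can be illustrated with a minimal sketch. This is not the chapter's actual code: the choice of Keras, the helper name `build_transfer_model`, the frozen backbone, the single softmax head, the optimizer, and the learning rate are all assumptions made for illustration, and the per-architecture input preprocessing is omitted.

```python
import tensorflow as tf

# The five backbones named in the abstract, mapped to their Keras constructors.
BACKBONES = {
    "VGG16": tf.keras.applications.VGG16,
    "VGG19": tf.keras.applications.VGG19,
    "Xception": tf.keras.applications.Xception,
    "DenseNet121": tf.keras.applications.DenseNet121,
    "ResNet50": tf.keras.applications.ResNet50,
}


def build_transfer_model(name, num_classes, weights="imagenet",
                         input_shape=(224, 224, 3)):
    """Hypothetical helper: pretrained backbone plus a new classification head."""
    # Drop the original ImageNet classifier and pool the final feature maps
    # into one feature vector per image.
    base = BACKBONES[name](include_top=False, weights=weights,
                           input_shape=input_shape, pooling="avg")
    # Assumption: the backbone is frozen, so only the new head is trained.
    base.trainable = False

    inputs = tf.keras.Input(shape=input_shape)
    features = base(inputs, training=False)  # keep BatchNorm in inference mode
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(features)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

A call such as `build_transfer_model("DenseNet121", num_classes=257)` would target Caltech-256 if its clutter class is kept as a 257th label (256 otherwise). Setting `base.trainable = True` with a lower learning rate would instead tune the whole network; which of these regimes the chapter uses is not stated in the abstract.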



Acknowledgements

This work is funded by the Science and Technology Development Fund (STDF), Egypt; Project ID: 42519, “Automatic Video Surveillance System for Crowd Scenes”.

Author information

Corresponding author

Correspondence to Walid Gomaa.

Copyright information

© 2021 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Gomaa, W. (2021). Deep Architectures in Visual Transfer Learning. In: Ahad, M.A.R., Inoue, A. (eds) Vision, Sensing and Analytics: Integrative Approaches. Intelligent Systems Reference Library, vol 207. Springer, Cham. https://doi.org/10.1007/978-3-030-75490-7_1
