An object recognition system based on convolutional neural networks and angular resolutions

Lukman, Achmad; Yang, Chuan-Kai

doi:10.1007/s11042-020-10312-x

Published: 08 February 2021

Volume 80, pages 16059–16085, (2021)

Multimedia Tools and Applications

Achmad Lukman¹ &
Chuan-Kai Yang¹

Abstract

Developing a 3D object recognition system often requires a large amount of training data, especially when deep learning methods are involved, so that training can converge. However, freely available 3D object datasets are usually quite limited, and researchers have proposed several techniques to overcome this problem. In this work, we propose a novel algorithm that uses angular resolutions and convolutional neural networks for 3D object recognition. It collects image shapes or contours from real objects by placing them on a rotating display and recording their appearance from multiple angular views. The chosen angular resolution is in the range of 0 to 180 degrees, and the viewing angle is selected by a binary search. We conducted a comparative experiment on the accuracy of six well-known network architectures (GoogleNet, CaffeNet, SqueezeNet, ResNet18, ResNet32, and ResNet50) to see how well these architectures adapt to the proposed angular resolution technique when classifying objects outside the lab environment. We also propose an incremental learning approach, in which we integrate our proposed method, which uses the GoogleNet model, with two existing pre-trained models, AlexNet and VGG16. In other words, our proposed method helps address a limitation of models that rely on existing pre-trained weights, namely recognizing new classes that were not previously recognized.
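The abstract does not spell out how the binary search over viewing angles proceeds. One plausible reading, offered here only as an illustrative sketch, is that the 0 to 180 degree range is bisected repeatedly, so that each level of the search halves the angular spacing between recorded views. The function name `midpoint_angles` and the `depth` parameter are hypothetical and do not come from the article.

```python
from collections import deque


def midpoint_angles(low=0.0, high=180.0, depth=3):
    """Return viewing angles obtained by bisecting [low, high] breadth-first.

    Assumption (not from the article): each search level adds the midpoint
    of every interval produced by the previous level.
    """
    angles = [low, high]            # start with the two end views
    queue = deque([(low, high, 1)])  # (interval start, interval end, level)
    while queue:
        a, b, level = queue.popleft()
        if level > depth:
            continue                # stop subdividing past the requested depth
        mid = (a + b) / 2.0
        angles.append(mid)          # record the new view at the midpoint
        queue.append((a, mid, level + 1))
        queue.append((mid, b, level + 1))
    return angles


if __name__ == "__main__":
    # Angles are listed in the order they would be added, coarse to fine.
    print(midpoint_angles(depth=2))
```

Under this reading, a depth of 2 gives views spaced 45 degrees apart and a depth of 3 gives views spaced 22.5 degrees apart; the full article should be consulted for the selection rule the authors actually used.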

Author information

Authors and Affiliations

Department of Information Management, National Taiwan University of Science and Technology, No. 43, Sec. 4, Keelung Road, Taipei, 106, Taiwan, Republic of China
Achmad Lukman & Chuan-Kai Yang

Corresponding author

Correspondence to Chuan-Kai Yang.

Ethics declarations

This work was supported in part by the Ministry of Science and Technology of Taiwan under the grant MOST 106-2221-E-011-148-MY3.

Conflict of Interests

Both authors have received the aforementioned funding support and both authors have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Categories and Subject Descriptors I.4.6 [Image Processing and Computer Vision]: Segmentation—Edge and Feature Detection; I.4.8 [Image Processing and Computer Vision]: Scene Analysis—Object Recognition

About this article

Cite this article

Lukman, A., Yang, CK. An object recognition system based on convolutional neural networks and angular resolutions. Multimed Tools Appl 80, 16059–16085 (2021). https://doi.org/10.1007/s11042-020-10312-x

Received: 10 October 2019
Revised: 30 October 2020
Accepted: 22 December 2020
Published: 08 February 2021
Issue Date: April 2021
DOI: https://doi.org/10.1007/s11042-020-10312-x
