Abstract
Developing 3D object recognition systems often requires a large amount of training data, especially when deep learning methods are involved, so that training converges. Freely available 3D object datasets are usually quite limited, however, and researchers have proposed several techniques to overcome this problem. In this work, we propose a novel algorithm that uses angular resolutions and convolutional neural networks for 3D object recognition. It collects image shapes or contours from real objects by placing them on a rotating display and recording their appearance from multiple angular views. The angular resolution is chosen in the range of 0-180 degrees, and the viewing angles are selected by binary search. We conducted a comparative experiment on the accuracy of six well-known network architectures (GoogleNet, CaffeNet, SqueezeNet, ResNet18, ResNet32, and ResNet50) to see how well these architectures adapt to the proposed angular resolution technique when classifying objects outside the lab environment. We also propose an incremental learning approach in which we integrate our method, which uses the GoogleNet model, with two existing pre-trained models, AlexNet and VGG16. In other words, our method helps address a limitation of the existing pre-trained models by enabling them to recognize new classes that they previously could not recognize.
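The abstract does not spell out how the binary search over viewing angles proceeds. One possible reading, given here only as a minimal sketch, is that the 0-180 degree range is repeatedly halved, so that each refinement level adds the midpoint of every existing gap and the angular resolution halves. The function name `bisection_view_angles` and the `levels` parameter are hypothetical and do not come from the paper.

```python
from typing import List


def bisection_view_angles(lo: float = 0.0, hi: float = 180.0, levels: int = 3) -> List[float]:
    """Return viewing angles obtained by repeatedly halving [lo, hi].

    Hypothetical helper (an assumption, not the authors' procedure):
    level 0 yields only the two endpoints, and each further level adds
    the midpoint of every current gap, halving the angular step.
    """
    if levels < 0:
        raise ValueError("levels must be non-negative")
    angles = [lo, hi]
    for _ in range(levels):
        refined = []
        # Walk over neighbouring pairs and insert the midpoint of each gap.
        for a, b in zip(angles, angles[1:]):
            refined.append(a)
            refined.append((a + b) / 2.0)
        refined.append(angles[-1])
        angles = refined
    return angles


if __name__ == "__main__":
    for level in range(4):
        views = bisection_view_angles(levels=level)
        step = views[1] - views[0]
        print(f"level {level}: step {step:g} deg, {len(views)} views")
```

Under this assumption, a finer resolution reuses every view already captured at the coarser levels, so the rotating display only needs to visit the newly added midpoints.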
Ethics declarations
This work was supported in part by the Ministry of Science and Technology of Taiwan under the grant MOST 106-2221-E-011-148-MY3.
Conflict of Interests
Both authors received the aforementioned funding support and declare no conflict of interest.
Additional information
Categories and Subject Descriptors I.4.6 [Image Processing and Computer Vision]: Segmentation—Edge and Feature Detection; I.4.8 [Image Processing and Computer Vision]: Scene Analysis—Object Recognition
About this article
Cite this article
Lukman, A., Yang, CK. An object recognition system based on convolutional neural networks and angular resolutions. Multimed Tools Appl 80, 16059–16085 (2021). https://doi.org/10.1007/s11042-020-10312-x
DOI: https://doi.org/10.1007/s11042-020-10312-x