An object recognition system based on convolutional neural networks and angular resolutions

Multimedia Tools and Applications

Abstract

Developing 3D object recognition systems often requires a huge amount of training data, especially when deep learning methods are involved, for the training to converge. However, freely available 3D object datasets are usually quite limited, and researchers have proposed several techniques to overcome this problem. In this work, we propose a novel algorithm that uses angular resolutions and convolutional neural networks for 3D object recognition. It collects image shapes or contours from real objects by placing them on a rotating display and recording their appearance from multiple angular views. The angular resolution is chosen from the range of 0-180 degrees, and the viewing angle is selected by a binary search. We conducted a comparative experiment on the accuracy of six well-known network architectures, namely GoogleNet, CaffeNet, SqueezeNet, ResNet18, ResNet32, and ResNet50, to see how well these architectures adapt to the proposed angular resolution technique when classifying objects outside the lab environment. We also propose an incremental learning approach, in which we integrate our proposed method, which uses the GoogleNet model, with two existing pre-trained models, AlexNet and VGG16. In other words, our proposed method helps models that rely on existing pre-trained weights to recognize new classes that they previously could not recognize.
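As a rough illustration of how a binary search over angular resolutions might look, the following Python sketch searches the step range for the coarsest step that still meets a target accuracy. It is not the authors' implementation: the function names, the `accuracy_at` callback, the full 360-degree turntable span, and the assumption that accuracy does not increase as the step grows coarser are all hypothetical choices made for this example.

```python
from typing import Callable, List, Optional


def view_angles(step_deg: int, span_deg: int = 360) -> List[int]:
    """Angles (degrees) at which a view is captured for a given step.

    span_deg=360 (one full turn of the rotating display) is an assumption.
    """
    if step_deg <= 0:
        raise ValueError("step_deg must be positive")
    return list(range(0, span_deg, step_deg))


def coarsest_sufficient_step(
    accuracy_at: Callable[[int], float],
    target: float,
    lo: int = 1,
    hi: int = 180,
) -> Optional[int]:
    """Binary-search the largest step in [lo, hi] whose accuracy meets target.

    accuracy_at(step) is a hypothetical callback that trains and evaluates a
    classifier on views captured every `step` degrees. The search is only
    valid if accuracy is non-increasing as the step grows (an assumption).
    Returns None when no step in the range reaches the target.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if accuracy_at(mid) >= target:
            best = mid      # mid is good enough; try a coarser step
            lo = mid + 1
        else:
            hi = mid - 1    # too coarse; try a finer step
    return best
```

A coarser step means fewer captured views per object, so a search of this kind trades capture and training effort against accuracy. In practice `accuracy_at` would be expensive to call, which is the reason to prefer a logarithmic number of evaluations over a linear scan.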

Author information

Corresponding author

Correspondence to Chuan-Kai Yang.

Ethics declarations

This work was supported in part by the Ministry of Science and Technology of Taiwan under the grant MOST 106-2221-E-011-148-MY3.

Conflict of Interests

Both authors received the aforementioned funding support, and neither author has a conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Categories and Subject Descriptors I.4.6 [Image Processing and Computer Vision]: Segmentation—Edge and Feature Detection; I.4.8 [Image Processing and Computer Vision]: Scene Analysis—Object Recognition

About this article

Cite this article

Lukman, A., Yang, CK. An object recognition system based on convolutional neural networks and angular resolutions. Multimed Tools Appl 80, 16059–16085 (2021). https://doi.org/10.1007/s11042-020-10312-x

  • DOI: https://doi.org/10.1007/s11042-020-10312-x