A Review on Deep Learning-Based Object Recognition Algorithms

  • Conference paper
  • Book: Machine Intelligence and Soft Computing

Abstract

Object recognition, which once seemed like a science-fiction prediction of a technological future, is now widely used in computer vision tasks. Computer vision plays a significant role in developing object recognition applications such as biometric regulation, image retrieval, security, machine inspection, medical imaging, and digital watermarking. It also plays a substantial role in the development of autonomous automobiles, where detecting objects and road signals supports safe driving. Computer vision is an essential application area of machine learning and deep learning, enabling new medical diagnostic methods that analyze X-rays and internal body scans. In this paper, we discuss several deep learning architectures for object recognition, including ConvNet, Region-based CNN (R-CNN), Fast R-CNN, Faster R-CNN, and the YOLO family. The way objects are recognized differs from model to model. Computer-Aided Medical Diagnosis (CAMD) has brought a drastic change in real-time gland (tissue) recognition and helps radiologists interpret a medical pathology image or video in seconds. After analyzing various object recognition algorithms, we provide a detailed report on the algorithms used to recognize objects in images, videos, and live webcam feeds.



Author information

Corresponding author

Correspondence to Mohan Mahanty.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Mahanty, M., Bhattacharyya, D., Midhunchakkaravarthy, D. (2022). A Review on Deep Learning-Based Object Recognition Algorithms. In: Bhattacharyya, D., Saha, S.K., Fournier-Viger, P. (eds) Machine Intelligence and Soft Computing. Advances in Intelligent Systems and Computing, vol 1419. Springer, Singapore. https://doi.org/10.1007/978-981-16-8364-0_7
