A Review on Deep Learning-Based Object Recognition Algorithms

Mahanty, Mohan; Bhattacharyya, Debnath; Midhunchakkaravarthy, Divya

doi:10.1007/978-981-16-8364-0_7

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1419)

Abstract

Object recognition, once the stuff of science fiction, is now widely used in computer vision tasks. Computer vision plays a significant role in developing object recognition applications such as biometric regulation, image retrieval, security, machine inspection, medical imaging, and digital watermarking. It also plays a substantial role in the development of autonomous automobiles, which drive safely by detecting objects and road signals. Computer vision is an essential aspect of machine learning and deep learning, and it enables new medical diagnostic methods that analyze X-rays and internal body scans. In this paper, we discuss several deep learning architectures for object recognition: ConvNet, Region-based CNN, Fast R-CNN, Faster R-CNN, and the YOLO family. The way objects are recognized differs from model to model. Computer-Aided Medical Diagnosis (CAMD) has drastically changed real-time gland (tissue) recognition and helps radiologists interpret a medical pathology image or video in seconds. After analyzing these algorithms, we provide a detailed report on the different approaches used to recognize objects in images, videos, and live webcam feeds.

Author information

Authors and Affiliations

Department of Computer Science and Multimedia, Lincoln University College, Kuala Lumpur, Malaysia
Mohan Mahanty & Divya Midhunchakkaravarthy
Department of Computer Science and Engineering, K L Deemed To Be University, KLEF, Guntur, 522502, India
Debnath Bhattacharyya

Corresponding author

Correspondence to Mohan Mahanty.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, India
Debnath Bhattacharyya
Department of Computer Science and Engineering, Jadavpur University, Kolkata, India
Sanjoy Kumar Saha
Shenzhen University, University Town, Shenzhen, China
Philippe Fournier-Viger

About this paper

Cite this paper

Mahanty, M., Bhattacharyya, D., Midhunchakkaravarthy, D. (2022). A Review on Deep Learning-Based Object Recognition Algorithms. In: Bhattacharyya, D., Saha, S.K., Fournier-Viger, P. (eds) Machine Intelligence and Soft Computing. Advances in Intelligent Systems and Computing, vol 1419. Springer, Singapore. https://doi.org/10.1007/978-981-16-8364-0_7

DOI: https://doi.org/10.1007/978-981-16-8364-0_7
Published: 22 February 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-8363-3
Online ISBN: 978-981-16-8364-0
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
