Abstract
Object recognition, once the stuff of science fiction, is now widely used in computer vision tasks. Computer vision plays a significant role in object recognition applications such as biometric regulation, image retrieval, security, machine inspection, medical imaging, and digital watermarking. It also plays a substantial role in the development of autonomous vehicles, where detecting objects and road signals enables safe driving. Computer vision is an essential aspect of machine learning and deep learning, and it enables new medical diagnostic methods that analyze X-rays and internal body scans. In this paper, we discuss several deep learning architectures for object recognition, including ConvNet, Region-based CNN (R-CNN), Fast R-CNN, Faster R-CNN, and the YOLO family. The approach to recognizing objects differs from model to model. Computer-Aided Medical Diagnosis (CAMD) has brought a drastic change to real-time gland (tissue) recognition and helps radiologists interpret a medical pathology image or video in seconds. After analyzing various object recognition algorithms, we provide a detailed report on those used to recognize objects in images, videos, and live webcam feeds.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Mahanty, M., Bhattacharyya, D., Midhunchakkaravarthy, D. (2022). A Review on Deep Learning-Based Object Recognition Algorithms. In: Bhattacharyya, D., Saha, S.K., Fournier-Viger, P. (eds) Machine Intelligence and Soft Computing. Advances in Intelligent Systems and Computing, vol 1419. Springer, Singapore. https://doi.org/10.1007/978-981-16-8364-0_7
DOI: https://doi.org/10.1007/978-981-16-8364-0_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-8363-3
Online ISBN: 978-981-16-8364-0
eBook Packages: Intelligent Technologies and Robotics (R0)