Abstract
Object detection and depth estimation are crucial in computer vision, supporting tasks such as autonomous navigation and scene understanding. Existing techniques face challenges such as occlusion and limited accuracy, which impede their real-world applicability. To address these limitations, the proposed work combines deep learning architectures with efficient computational methods. By fusing an advanced object detection model with a depth estimation network, the proposed work achieves substantial improvements in accuracy and precision and moves toward real-time implementation. The approach is augmented with a novel depth estimation technique that extracts diagonal pixel lengths and combines them with the actual depths from the dataset. Linear and polynomial regression were then applied to these data, and the polynomial model (98% average accuracy) surpassed the linear model (80.96% accuracy). These findings highlight the importance of capturing the non-linear relationship between pixel length and object depth, showcase YOLOv4's robust object detection capabilities, and underline the value of depth estimation from visual cues.
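The regression step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the diagonal is measured on a detector's axis-aligned bounding box, uses synthetic data in place of the dataset, picks a polynomial degree of 2 arbitrarily, and defines accuracy as 100 × (1 − mean relative error). The function names are hypothetical.

```python
import numpy as np

def diagonal_length(box):
    """Diagonal (in pixels) of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return float(np.hypot(x_max - x_min, y_max - y_min))

def fit_depth_model(diagonals, depths, degree):
    """Least-squares polynomial fit of depth as a function of diagonal length."""
    return np.poly1d(np.polyfit(diagonals, depths, degree))

def mean_accuracy(model, diagonals, depths):
    """Illustrative metric (an assumption): 100 * (1 - mean relative error)."""
    predicted = model(diagonals)
    return 100.0 * (1.0 - np.mean(np.abs(predicted - depths) / depths))

# Synthetic stand-in for (diagonal, depth) pairs: depth falls off roughly as
# 1 / diagonal, as a pinhole camera would suggest, plus a little noise.
rng = np.random.default_rng(0)
diagonals = np.linspace(40.0, 400.0, 50)
depths = 2000.0 / diagonals + rng.normal(0.0, 0.2, diagonals.size)

linear_model = fit_depth_model(diagonals, depths, degree=1)
poly_model = fit_depth_model(diagonals, depths, degree=2)  # degree is assumed

linear_acc = mean_accuracy(linear_model, diagonals, depths)
poly_acc = mean_accuracy(poly_model, diagonals, depths)

print(f"linear accuracy:     {linear_acc:.2f}%")
print(f"polynomial accuracy: {poly_acc:.2f}%")

# Estimating depth for a new detection (box coordinates are made up).
new_box = (100, 120, 220, 280)
print(f"estimated depth: {poly_model(diagonal_length(new_box)):.2f}")
```

On this synthetic curve the polynomial fit tracks the non-linear fall-off more closely than the straight line, which mirrors the qualitative finding reported above; the printed numbers have no bearing on the 98% and 80.96% figures.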
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Katiyar, R., Kumari, U., Panagar, K., Patil, K., Manjunath, B.M., Gowda, Y.J. (2024). Object Detection and Depth Estimation Using Deep Learning. In: Garg, D., Rodrigues, J.J.P.C., Gupta, S.K., Cheng, X., Sarao, P., Patel, G.S. (eds) Advanced Computing. IACC 2023. Communications in Computer and Information Science, vol 2053. Springer, Cham. https://doi.org/10.1007/978-3-031-56700-1_5
DOI: https://doi.org/10.1007/978-3-031-56700-1_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-56699-8
Online ISBN: 978-3-031-56700-1
eBook Packages: Computer Science, Computer Science (R0)