Abstract
Autonomous vehicles can accurately perceive the layout of a traffic scene and the distances between visual objects by understanding scene depth. Depth estimation is therefore a crucial step in obstacle avoidance and pedestrian protection for autonomous vehicles. In this paper, a method for stereo depth estimation from image sequences is introduced. We improve the performance of a deep learning-based model by combining the depth hints algorithm with a MobileNetV2 encoder, which enhances the loss function and increases computing speed. To the best of our knowledge, this is the first time MobileNetV2 has been applied to depth estimation on the KITTI dataset.
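The depth hints idea mentioned above supervises a self-supervised model with stereo-matching depths only at pixels where those hints actually reproject better than the current prediction. The following is a minimal NumPy sketch of that per-pixel gating, assuming per-pixel reprojection losses have already been computed; the function name, array shapes, and the simple log-depth penalty are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def depth_hints_loss(pred_loss, hint_loss, pred_depth, hint_depth):
    """Combine a photometric loss with a gated depth-hint supervision term.

    pred_loss:  per-pixel reprojection loss of the network's predicted depth
    hint_loss:  per-pixel reprojection loss of the stereo (e.g. SGM) hint depth
    pred_depth: per-pixel predicted depth (positive values)
    hint_depth: per-pixel hint depth (positive values)
    """
    # Use the hint only where it explains the image better than the prediction.
    mask = hint_loss < pred_loss
    # Supervised log-depth penalty, applied only on the masked pixels.
    supervised = np.abs(np.log(pred_depth) - np.log(hint_depth))
    # Total loss: usual photometric term plus the gated supervision term.
    return pred_loss.mean() + (supervised * mask).mean()
```

In training, the mask changes every iteration as the prediction improves, so unreliable stereo hints are ignored automatically rather than filtered by a fixed threshold.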
© 2023 Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, X., Yan, W.Q. (2023). Depth Estimation of Traffic Scenes from Image Sequence Using Deep Learning. In: Wang, H., et al. Image and Video Technology. PSIVT 2022. Lecture Notes in Computer Science, vol 13763. Springer, Cham. https://doi.org/10.1007/978-3-031-26431-3_15
Print ISBN: 978-3-031-26430-6
Online ISBN: 978-3-031-26431-3