Abstract
Autonomous vehicles can accurately perceive the layout of a traffic scene and the distances between visual objects by understanding scene depth. Depth estimation is therefore a crucial step in obstacle avoidance and pedestrian protection for autonomous vehicles. In this paper, a method for stereo depth estimation from image sequences is introduced. We improve the performance of a deep learning-based model by combining the depth hints algorithm with a MobileNetV2 encoder, which enhances the loss function and increases computing speed. To the best of our knowledge, this is the first time MobileNetV2 has been applied to depth estimation on the KITTI dataset.
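The depth hints idea mentioned above supervises a self-supervised model with stereo-matching depths only at pixels where those hints actually reproject better than the current prediction. The following is a minimal NumPy sketch of that per-pixel gating, assuming per-pixel reprojection losses have already been computed; the function name, array shapes, and the simple log-depth penalty are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def depth_hints_loss(pred_loss, hint_loss, pred_depth, hint_depth):
    """Combine a photometric loss with a gated depth-hint supervision term.

    pred_loss:  per-pixel reprojection loss of the network's predicted depth
    hint_loss:  per-pixel reprojection loss of the stereo (e.g. SGM) hint depth
    pred_depth: per-pixel predicted depth (positive values)
    hint_depth: per-pixel hint depth (positive values)
    """
    # Use the hint only where it explains the image better than the prediction.
    mask = hint_loss < pred_loss
    # Supervised log-depth penalty, applied only on the masked pixels.
    supervised = np.abs(np.log(pred_depth) - np.log(hint_depth))
    # Total loss: usual photometric term plus the gated supervision term.
    return pred_loss.mean() + (supervised * mask).mean()
```

In training, the mask changes every iteration as the prediction improves, so unreliable stereo hints are ignored automatically rather than filtered by a fixed threshold.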
© 2023 Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, X., Yan, W.Q. (2023). Depth Estimation of Traffic Scenes from Image Sequence Using Deep Learning. In: Wang, H., et al. Image and Video Technology. PSIVT 2022. Lecture Notes in Computer Science, vol 13763. Springer, Cham. https://doi.org/10.1007/978-3-031-26431-3_15
Print ISBN: 978-3-031-26430-6
Online ISBN: 978-3-031-26431-3