
End-to-End Deep Learning Method with Disparity Correction for Stereo Matching

  • Research Article - Computer Engineering and Computer Science
  • Published in: Arabian Journal for Science and Engineering

Abstract

In end-to-end deep learning networks for stereo matching, a central research question is how to perceive rich image detail at an acceptable computational cost and thereby improve the final matching accuracy. In this paper, we design a cascaded disparity-correction network based on DispNet, which refines the initial disparity map by combining intermediate outputs of the trunk network. We name the proposed method ETE-DC-CNN (end-to-end disparity-correction convolutional neural network). In the trunk network, a mixed dilated convolution module extracts features from the original images, depthwise-separable convolutions assemble the feature maps into a 3D disparity space volume, and a 2D encoder-decoder module then produces the initial disparity map. In the correction network, gradient information from the initial feature maps is combined to reconstruct a corrected matching cost volume guided by the initial disparity map, and a smaller encoder-decoder module is trained to refine the result to sub-pixel accuracy. The algorithm is validated on the SceneFlow and KITTI 2012/2015 datasets. On the KITTI 2012 test set, the proposed method achieves 3.46, 2.12, 1.19, and 0.6 in non-occluded regions, and 4.04, 2.55, 1.58, and 0.7 over all regions, on the Er (> 2 px), Er (> 3 px), Er (> 5 px), and Epe metrics respectively, outperforming the backbone network DispNet.
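The trunk-network pipeline described above (features, then a 3D disparity space volume, then a disparity map refined to sub-pixel precision) can be illustrated with a deliberately simplified NumPy sketch: a single-channel absolute-difference cost volume with winner-take-all disparity selection and parabolic sub-pixel interpolation. This is not the authors' learned ETE-DC-CNN (which replaces these hand-crafted steps with convolutional feature extraction and encoder-decoder cost aggregation); the function names `cost_volume` and `wta_subpixel` are illustrative only.

```python
import numpy as np

def cost_volume(left, right, max_disp):
    """Build a (max_disp, H, W) disparity space volume from two
    single-channel feature maps using absolute-difference cost."""
    H, W = left.shape
    vol = np.full((max_disp, H, W), np.inf)
    for d in range(max_disp):
        # left pixel x is compared against right pixel x - d
        vol[d, :, d:] = np.abs(left[:, d:] - right[:, :W - d])
    return vol

def wta_subpixel(vol):
    """Winner-take-all disparity with parabolic sub-pixel refinement,
    fitting a parabola through the cost at d-1, d, d+1."""
    D = vol.shape[0]
    d0 = vol.argmin(axis=0)              # integer winner-take-all disparity
    disp = d0.astype(float)
    di = np.clip(d0, 1, D - 2)           # need both neighbours for the fit
    rows, cols = np.indices(d0.shape)
    c0 = vol[di, rows, cols]
    cm = vol[di - 1, rows, cols]
    cp = vol[di + 1, rows, cols]
    denom = cm - 2 * c0 + cp
    ok = (d0 == di) & np.isfinite(cm) & np.isfinite(cp) & (denom > 1e-12)
    # vertex of the parabola gives an offset in (-0.5, 0.5)
    disp[ok] = di[ok] + 0.5 * (cm[ok] - cp[ok]) / denom[ok]
    return disp

# demo on a synthetic pair with a constant 2-px shift
left = np.random.default_rng(0).random((4, 16))
right = np.zeros_like(left)
right[:, :-2] = left[:, 2:]
disp = wta_subpixel(cost_volume(left, right, 8))
# disp[:, 2:] lies within 0.5 px of the true disparity of 2
```

In a learned pipeline such as the one in this paper, the hand-crafted absolute-difference cost is replaced by features from the dilated-convolution module, and the winner-take-all step by an encoder-decoder that regresses disparity; the volume construction and sub-pixel refinement shown here correspond to the same two conceptual stages.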



Acknowledgements

We appreciate the support of the National Key R&D Program of China (No. 2022YFC2803903) and the Key R&D Program of Zhejiang Province (No. 2021C03013).

Author information

Corresponding author

Correspondence to Zhiyu Zhou.

Ethics declarations

Conflict of interest

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Zhou, Z., Liu, M., Guo, J. et al. End-to-End Deep Learning Method with Disparity Correction for Stereo Matching. Arab J Sci Eng 49, 3331–3345 (2024). https://doi.org/10.1007/s13369-023-07985-5
